Students' Academic Performance Using Clustering Technique

Students' Academic Performance: Knowledge Discovery from Data

Upload: saniacorreya

Post on 12-Apr-2017


Students' Academic Performance

Knowledge Discovery from Data

Introduction

Our project aims to assess students' academic performance and to find out whether there is any general pattern in their marks and performance.

To do so, we analyze both the internal and external marks of each student.

We followed the KDD process steps below to prepare and mine our data.

Learning the application domain

Learning the application domain is the first step in the KDD process.

We need a clear understanding of the application domain and of our objectives.

The institution considered for mining is the MCA batch of Rajagiri College of Social Sciences.

We collected the academic records of previous years from the Department of Computer Science.

Create a target data set: data selection

We selected the marks of the 2007-2010 batch for analysing the pattern.

There were around 45 records (one per student).

Both the internal and external marks of each student were selected, in order to find the performance pattern.

Internal & External Dataset

Data cleaning & preprocessing

Data cleaning is the step where noise and irrelevant data are

removed from the large data set.

This is a very important pre-processing step because our

outcome would be dependent on the quality of selected data.

This involves removing duplicate records, entering logically correct values for missing records (absent students), removing unnecessary data fields and standardizing the data format.

There was not much duplicate or unnecessary data in the collected records. The dataset was already partially clean.
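The cleaning steps above can be sketched in a few lines of Python. This is only an illustration: the row layout `(roll_no, paper, mark)`, the sample values and the choice of `0` for an absent student are all assumptions, since the slides do not say which value was entered for missing marks.

```python
# Hypothetical raw rows: (roll_no, paper, mark). None marks an absent student.
raw_rows = [
    (1, "Paper1", 38),
    (1, "Paper1", 38),    # exact duplicate, should be dropped
    (2, "Paper1", None),  # absent student, mark is missing
    (3, "Paper1", 27),
]

ABSENT_MARK = 0  # assumption: an absent student is recorded as scoring 0


def clean(rows):
    """Drop exact duplicates and replace missing marks with ABSENT_MARK."""
    seen = set()
    cleaned = []
    for roll_no, paper, mark in rows:
        key = (roll_no, paper, mark)
        if key in seen:
            continue  # skip a record we have already kept
        seen.add(key)
        cleaned.append((roll_no, paper, ABSENT_MARK if mark is None else mark))
    return cleaned


cleaned_rows = clean(raw_rows)
```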

The internal marks and external marks of each student were stored in different records.

By applying data integration, these records were combined into one record per student.

The new dataset consists of the internal mark details and external mark details of each student in a single record.
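A minimal sketch of this integration step, assuming each source is keyed by a roll number. The field names and marks are made up for illustration and are not taken from the actual dataset.

```python
# Hypothetical per-student records kept in two separate sources.
internal = {
    1: {"int_paper1": 38, "int_paper2": 35},
    2: {"int_paper1": 30, "int_paper2": 28},
}
external = {
    1: {"ext_paper1": 52, "ext_paper2": 48},
    2: {"ext_paper1": 41, "ext_paper2": 39},
}


def integrate(internal_records, external_records):
    """Join the two sources on roll number into one record per student."""
    merged = {}
    # Only students present in both sources are kept (an inner join).
    for roll_no in sorted(internal_records.keys() & external_records.keys()):
        merged[roll_no] = {**internal_records[roll_no], **external_records[roll_no]}
    return merged


students = integrate(internal, external)
```

Keeping only roll numbers that appear in both sources is one possible choice; a student missing from one source would instead need the cleaning rule for absent students.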

Data reduction & transformation

The data is transformed into an appropriate form to make it ready for the data mining step.

The dataset contains the marks of 5 theory papers and 2 lab papers for each of the 5 semesters.

These marks are transformed into the sum of internal marks and the sum of external marks for each student, to make the pattern easier to analyse.
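As a small worked example of this reduction, the sketch below collapses one student's seven papers into a single (internal total, external total) pair. The marks are invented, and summing per semester is an assumption based on the per-semester clustering described later.

```python
# Hypothetical record for one student in one semester:
# 5 theory papers followed by 2 lab papers.
record = {
    "internal": [38, 35, 40, 33, 36, 45, 44],
    "external": [52, 48, 60, 45, 50, 70, 68],
}


def summarize(rec):
    """Reduce the per-paper marks to (sum of internals, sum of externals)."""
    return (sum(rec["internal"]), sum(rec["external"]))


point = summarize(record)  # one 2-D point per student, ready for clustering
```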

Cluster Analysis

The data mining technique we used here is clustering.

A cluster is a collection of data objects that are similar to one another within the same cluster and dissimilar to objects in other clusters.

We first partitioned the data set into groups based on data similarity and then assigned labels to the groups.

Choosing functions of data mining

K-Means Partitioning

The k-means algorithm takes an input parameter k and partitions a set of n objects into k clusters.

Here we set the number of clusters to 4.

Each object is assigned to the cluster whose center is nearest to it.

For each semester we found the clusters separately and labeled them as Excellent, Good, Fair and Poor students.
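The partitioning described above can be sketched in plain Python. This is not the Orange implementation: the deterministic choice of starting centers, the sample points and the rule of labeling clusters by the total of their centroid are all assumptions made for the sake of a small, reproducible example.

```python
import math

LABELS = ["Poor", "Fair", "Good", "Excellent"]


def kmeans(points, k, iterations=100):
    """Plain k-means (Lloyd's algorithm) on 2-D points; requires k >= 2."""
    # Assumption: start from k points evenly spaced through the data
    # sorted by total mark, so the result is reproducible.
    ordered = sorted(points, key=sum)
    step = (len(ordered) - 1) / (k - 1)
    centers = [ordered[round(i * step)] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centers.append(tuple(sum(axis) / len(cluster) for axis in zip(*cluster)))
            else:
                new_centers.append(centers[i])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers, clusters


def label_clusters(centers, clusters):
    """Assumption: rank clusters by centroid total, lowest is 'Poor'."""
    order = sorted(range(len(centers)), key=lambda i: sum(centers[i]))
    return {LABELS[rank]: clusters[i] for rank, i in enumerate(order)}


# Hypothetical (internal total, external total) points for one semester.
points = [(100, 150), (105, 160), (150, 230), (155, 240),
          (200, 310), (205, 320), (250, 390), (255, 400)]
centers, clusters = kmeans(points, k=4)
labeled = label_clusters(centers, clusters)
```

Running the same two functions once per semester would give one labeled partition per semester, matching the per-semester analysis in the slides.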

Choosing mining algorithms

The tool used for pattern evaluation is Orange.

Orange Cluster Analysis

Number of clusters selected: 4

[Figures: Orange cluster plots for Semester 1 to Semester 5, with clusters labeled Poor, Fair, Good and Excellent]

Centroid Analysis

[Figures: centroid plots for Semester 1 to Semester 5]

Combined Centroid Analysis

Data mining: search for patterns of interest

From the mining process we found that the clusters of all 5 semesters followed the same pattern of performance.

A student with high internal marks has higher external marks, and a student with low internal marks has lower external marks.

There is a direct relation between the internal and the external marks.

In some cases this evaluation is not valid, for example when a student is absent for an internal exam but scores high marks in the externals (or vice versa).
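One way to put a number on the "direct relation" is the Pearson correlation coefficient between internal and external totals. The slides do not report such a figure; the sketch below only shows how it could be computed, using invented totals.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical totals for four students.
internal_totals = [100, 150, 200, 250]
external_totals = [160, 230, 320, 390]
r = pearson(internal_totals, external_totals)  # close to 1 for a direct relation
```

A value near 1 indicates that the two totals rise together. Exceptional cases such as an absence in an internal exam would pull the value down.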

CONCLUSION

A student's performance in the university exam can be predicted with the help of their internal marks. There is a direct relation between the internal and the external marks.

A student with low internal marks tends to get low external marks too.
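The slides do not say how such a prediction would be made. One simple option, shown here purely as an assumption, is to fit a least-squares line to past (internal total, external total) pairs and read the expected external total off that line. The data below is invented and deliberately lies exactly on a line.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past totals, constructed so that external = 1.5 * internal + 10.
internal_totals = [100, 150, 200, 250]
external_totals = [160, 235, 310, 385]

slope, intercept = fit_line(internal_totals, external_totals)
predicted_external = slope * 180 + intercept  # estimate for an internal total of 180
```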

Use of discovered knowledge representation

Thank You