Presentation slides: Preprocessing of Academic Data for Mining Association Rule [WADM 2013]

Preprocessing of Academic Data for Mining Association Rule

Upload: tanvin

Post on 15-Aug-2015


Preprocessing of Academic Data for Mining Association Rule

Overview

Main Objective of Our Research

Concept of KDD

Methods of Preprocessing Academic Data for Mining

Data Analysis

Relational Database

Universal Database

Synthetic Data Population

Data Transformation

Association Rules of Interest for Academic Data

Main Objective of Our Research

To gain knowledge of, and find correlations among, several explicit and implicit factors related to:

Students’ academic progress

Decline in students’ potential

Abandonment

Why do students drop out before graduation?

Retention

Why do some students remain enrolled well beyond the expected duration?


Concept of KDD

Knowledge Discovery and Data Mining Process

Data -> (Selection) -> Target Data -> (Preprocessing) -> Preprocessed Data -> (Transformation) -> Transformed Data -> (Data mining) -> Patterns/Models -> (Interpretation/Evaluation) -> Knowledge

Why Preprocessing before Data Mining?

Reasons for proposing a preprocessing technique before applying association rule mining to academic data:

Proper interpretation of the results of mining is essential to ensure that useful knowledge is derived from the data.

Blind application of data-mining methods can be a dangerous activity, easily leading to the discovery of meaningless and invalid patterns.


Methods of Preprocessing Academic Data for Mining

Data Analysis of BIIS Database

Personal Information: Age; Gender; Origin Area (Birth Place); Present Address; Hall Resident/Attached

Academic Information: SSC or equivalent GPA, Board; HSC or equivalent GPA, Board; Admission Year / Batch; Department; Current Level/Term; Current CGPA; Term-wise CGPA; Subject-wise detailed Grade; Credit Hour Completed


Methods of Preprocessing Academic Data for Mining (Contd.)

Data Analysis (contd.)

Factors related to Academic Performance of Student

Age; Origin Area; Record of Taken Courses; Experience of Teachers; Hall Resident/Attached; Term Duration; SSC & HSC GPA/Board; Gender; CGPA


Methods of Preprocessing Academic Data for Mining (Contd.)

Data Analysis (contd.)

Factors related to Abandonment/Retention of Student

Age; Origin Area; Credit Hour Ratio; Session Jam; Hall Resident/Attached; Term Duration; SSC & HSC GPA/Board; Gender; Current CGPA; Stay Duration


Methods of Preprocessing Academic Data for Mining (Contd.)

Data Analysis (contd.)

Factors related to Condition of Academic Institution

Rate of Student Retention; Average CGPA of all Students; Experience of Teachers; Rate of Student Abandonment; Research & Publications


Methods of Preprocessing Academic Data for Mining (Contd.)

Relational Database

[Entity-relationship diagram. Entities: Student, Course, Grade Sheet. Relationships: represents, achieves.]

Finding correlation between performance in different courses


Methods of Preprocessing Academic Data for Mining (Contd.)

Universal Database & Synthetic Data Population

How have we populated data in the universal database?

Let us consider a 3 credit course, CSE 303.

Now we assume 5 possible scenarios:

1. A student appears in the class tests (CT), has attendance above 60%, and appears in the term final examination.

2. A student appears in the CT but has attendance below 60%, and appears in the term final examination.

3. Class test and attendance marks are carried over, and the student appears in the term final examination.

4. A student appears in the CT and has attendance above 60%, but does not appear in the term final examination.

5. A student attends less than 60% of classes and appears in neither the CT nor the term final examination.

Methods of Preprocessing Academic Data for Mining (Contd.)

Universal Database & Synthetic Data Population (contd.)

Two algorithms have been developed to populate the universal table:

Synthetic_Generation()

Generate_Grade()

Student_Id | CSE303_secA | CSE303_secB | CSE303_CT | CSE303_Attendance | CSE303_Total | CSE303_Grade | …
0805001 | 90 | 75 | 55 | 30 | 250 | A+
0805002 | 85 | 70 | 45 | 25 | 225 | A
… | … | … | … | … | … | …

Records of all courses taken by a student are generated synthetically in a single row of the universal table, keyed by student ID.
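The single-row layout described above can be illustrated with a short sketch. This is not the authors' Synthetic_Generation() or Generate_Grade(), whose details the slides do not give. The input layout (one tuple per student, course, and mark component) and the function name build_universal_table are assumptions made for illustration. The total is taken to be the sum of the four components, which matches both sample rows (90 + 75 + 55 + 30 = 250 and 85 + 70 + 45 + 25 = 225).

```python
from collections import defaultdict

# Hypothetical relational rows: (student_id, course, component, marks).
# The values mirror the two sample rows of the universal table.
relational_rows = [
    ("0805001", "CSE303", "secA", 90),
    ("0805001", "CSE303", "secB", 75),
    ("0805001", "CSE303", "CT", 55),
    ("0805001", "CSE303", "Attendance", 30),
    ("0805002", "CSE303", "secA", 85),
    ("0805002", "CSE303", "secB", 70),
    ("0805002", "CSE303", "CT", 45),
    ("0805002", "CSE303", "Attendance", 25),
]


def build_universal_table(rows):
    """Pivot per-component rows into one dict (one 'row') per student."""
    table = defaultdict(dict)
    for student_id, course, component, marks in rows:
        table[student_id][f"{course}_{component}"] = marks

    # Assumption: the course total is the sum of its mark components.
    for record in table.values():
        totals = defaultdict(int)
        for column, marks in record.items():
            course, _component = column.split("_", 1)
            totals[course] += marks
        for course, total in totals.items():
            record[f"{course}_Total"] = total
    return dict(table)


universal = build_universal_table(relational_rows)
print(universal["0805001"])
```

Adding a second course for the same student only adds more columns to that student's row, which is the property the universal table relies on.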


Methods of Preprocessing Academic Data for Mining (Contd.)

Data Transformation

Definition | Credit Hour | Range
SecA_high or SecB_high | 3 | >=75 && <=105
SecA_average or SecB_average | 3 | >=60 && <75
SecA_low or SecB_low | 3 | <60
CT_high | 3 | >=48 && <=60
CT_average | 3 | >=36 && <48
CT_low | 3 | <36
Grade_high | 3 | >=225 && <=300
Grade_average | 3 | >=180 && <225
Grade_low | 3 | <180

Transformation rule table for 3.0 credit course
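As a minimal sketch of how the transformation rules above could be applied, the snippet below maps raw marks for a 3.0 credit course to one-hot items. The names RULES_3_CREDIT and to_items are hypothetical. Levels are checked from highest to lowest, so a boundary value such as a CT mark of 48 is treated as CT_high. The upper bounds (105, 60, 300) are not validated here.

```python
# Lower bounds taken from the transformation rule table for a 3.0 credit
# course, listed from the highest level to the lowest.
RULES_3_CREDIT = {
    "SecA": [("high", 75), ("average", 60), ("low", 0)],
    "SecB": [("high", 75), ("average", 60), ("low", 0)],
    "CT": [("high", 48), ("average", 36), ("low", 0)],
    "Grade": [("high", 225), ("average", 180), ("low", 0)],
}


def to_items(marks):
    """Map e.g. {'SecA': 90} to flags such as {'SecA_high': 1, 'SecA_average': 0, ...}."""
    flags = {}
    for attribute, value in marks.items():
        matched = False
        for level, lower_bound in RULES_3_CREDIT[attribute]:
            # Only the first (highest) level whose bound is met gets a 1.
            hit = (not matched) and value >= lower_bound
            flags[f"{attribute}_{level}"] = int(hit)
            matched = matched or hit
    return flags


# Marks of student 0805002 from the universal table.
row = to_items({"SecA": 85, "SecB": 70, "CT": 45, "Grade": 225})
print(row)
```

For student 0805002 this sets SecA_high, SecB_average, CT_average, and Grade_high, matching that student's row in the transformed table.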

Student_ID | SecA_high | SecA_average | SecA_low | SecB_high | SecB_average | SecB_low | CT_high | CT_average | CT_low | Grade_high | Grade_average | Grade_low
0805001 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0
0805002 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0
… | … | … | … | … | … | … | … | … | … | … | … | …

Transformed table from universal table


Association Rules for Academic Data

No. | Association Rule of Interest | Purpose
1. | Course_No => CGPA_high | Performance of Individual Course
2. | Course_No => CGPA_low | Performance of Individual Course
3. | Sec_A_high => CGPA_high | Impact of Section of Answer Script
4. | Sec_B_high => CGPA_high | Impact of Section of Answer Script
5. | CT_high => CGPA_high | Impact of Class Test
6. | CT_low => CGPA_low | Impact of Class Test
7. | Hall_Resident => CGPA_low | Impact of Residence
8. | Attached => CGPA_high | Impact of Residence
9. | Course_No_1 => Course_No_2 | Correlation of Courses
10. | (Course_No_1, Course_No_2) => Course_No_3 | Correlation of Courses
11. | Permanent_Address_City => CGPA_high | Impact of Locality
12. | Permanent_Address_Rural => CGPA_low | Impact of Locality
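Rules of the form X => Y are conventionally scored by support (the fraction of records containing both X and Y) and confidence (support of X and Y together divided by support of X). The slides do not state which measures or thresholds the authors use, so the following is only an illustrative sketch. The helper names are hypothetical, Grade_high stands in for CGPA_high because the transformed table has no CGPA column, and the last two transactions are made up for the example.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never fires, so its confidence is reported as 0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


# Each transaction is the set of items flagged 1 for one student.
transactions = [
    {"SecA_high", "SecB_high", "CT_high", "Grade_high"},        # 0805001
    {"SecA_high", "SecB_average", "CT_average", "Grade_high"},  # 0805002
    {"SecA_low", "SecB_low", "CT_low", "Grade_low"},            # made up
    {"SecA_average", "SecB_low", "CT_high", "Grade_average"},   # made up
]

rule_support = support(transactions, {"CT_high", "Grade_high"})
rule_confidence = confidence(transactions, {"CT_high"}, {"Grade_high"})
print(rule_support, rule_confidence)
```

On these four transactions, CT_high appears twice and appears together with Grade_high once, so the rule CT_high => Grade_high has support 0.25 and confidence 0.5.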


Future Work

Academic Performance: Family Background; Previous Academic Record; Seat Allotment in Hall; Offering Scholarship

Abandonment/Retention: Stay Duration; Session Jam; Unwanted Leaves; Long-term Break

Condition of Institution: Average CGPA of all Students; Term Completion Rate; Abandonment/Retention Rate; Research & Publications

Developing a new mining algorithm, to be tested using the synthetic dataset

Collecting real data from BIIS and using it, without compromising privacy, to discover knowledge


Conclusions

Transforms continuous-valued attributes into a form to which association rule mining algorithms can be applied to obtain the required educational knowledge

Guides the discovery of the required knowledge from a realistic dataset and its application in real-life scenarios

The model is developed using BIIS data but can be generalized for application to any higher educational institution


Questions and suggestions are welcome
