Presentation slides: Preprocessing of Academic Data for Mining Association Rule [WADM 2013]
Overview
Main Objective of Our Research
Concept of KDD
Methods of Preprocessing Academic Data for Mining
Data Analysis
Relational Database
Universal Database
Synthetic Data Population
Data Transformation
Association Rules of Interest for Academic Data
Preprocessing of Academic Data for Mining Association Rule
Main Objective of Our Research
To gain knowledge of, and find correlations among, several explicit and implicit factors related to:
Students’ academic progress
Decline in students’ potential
Abandonment
Why do students drop out before graduation?
Retention
Why do some students remain enrolled well beyond the expected duration?
Concept of KDD
Knowledge Discovery and Data Mining Process
Data -> Selection -> Target Data -> Preprocessing -> Preprocessed Data -> Transformation -> Transformed Data -> Data mining -> Patterns/Models -> Interpretation/Evaluation -> Knowledge
Why Preprocessing before Data Mining?
Reasons for proposing a preprocessing technique before mining association rules from academic data:
Proper interpretation of the mining results is essential to ensure that useful knowledge is derived from the data.
Blind application of data-mining methods can be a dangerous activity, easily leading to the discovery of meaningless and invalid patterns.
Methods of Preprocessing Academic Data for Mining
Data Analysis of BIIS Database
Personal Information:
Age
Gender
Origin Area (Birth Place)
Present Address
Hall Resident/Attached
Academic Information:
SSC or equivalent GPA, Board
HSC or equivalent GPA, Board
Admission Year / Batch
Department
Current Level/Term
Current CGPA
Term wise CGPA
Subject wise detailed Grade
Credit Hour Completed
Methods of Preprocessing Academic Data for Mining (Contd.)
Data Analysis (contd.)
Figure: Factors related to Academic Performance of Student
Factors: Age, Origin Area, Record of Taken Courses, Experience of Teachers, Hall Resident/Attached, Term Duration, SSC & HSC GPA/Board, Gender, CGPA
Methods of Preprocessing Academic Data for Mining (Contd.)
Data Analysis (contd.)
Figure: Factors related to Abandonment/Retention of student
Factors: Age, Origin Area, Credit Hour Ratio, Session Jam, Hall Resident/Attached, Term Duration, SSC & HSC GPA/Board, Gender, Current CGPA, Stay Duration
Methods of Preprocessing Academic Data for Mining (Contd.)
Data Analysis (contd.)
Figure: Factors related to Condition of Academic Institution
Factors: Rate of Student Retention, Average CGPA of all Students, Experience of Teachers, Rate of Student Abandonment, Research & Publications
Methods of Preprocessing Academic Data for Mining (Contd.)
Relational Database
Figure: diagram relating Student, Course, and Grade Sheet through the relationships "represents" and "achieves"
Finding correlation between performance in different courses
Methods of Preprocessing Academic Data for Mining (Contd.)
Universal Database & Synthetic Data Population
How have we populated data in the universal database?
Let us consider a 3 credit course, CSE 303.
Now we assume 5 possible scenarios:
1. A student appeared in the class tests (CT), had attendance above 60%, and appeared in the term final examination.
2. A student appeared in the CT but had attendance below 60%, and appeared in the term final examination.
3. Class test and attendance marks are carried over, and the student appeared in the term final examination.
4. A student appeared in the CT and had attendance above 60%, but did not appear in the term final examination.
5. A student attended less than 60% of classes and appeared in neither the CT nor the term final examination.
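The five scenarios can be read as combinations of a few flags that decide which mark components exist for a student. The sketch below is only an illustration of that idea, not the authors' Synthetic_Generation() algorithm: the `Scenario` tuple, the `synthetic_marks` function, and the uniform random draws are all hypothetical. The component maxima (105 per answer-script section, 60 for class tests, 30 for attendance) are assumptions chosen to match the 3-credit ranges and the 300-mark total that appear elsewhere in the slides, and scaling attendance marks from an attendance rate is likewise assumed.

```python
import random
from typing import NamedTuple, Optional, Tuple

class Scenario(NamedTuple):
    appeared_ct: bool            # student sat the class tests
    attendance_ok: bool          # attendance at or above the 60% line
    appeared_final: bool         # student sat the term final examination
    carried_over: bool = False   # CT and attendance reused from an earlier term

# The five scenarios from the slide, encoded as flags (hypothetical encoding).
SCENARIOS = {
    1: Scenario(True, True, True),
    2: Scenario(True, False, True),
    3: Scenario(True, True, True, carried_over=True),
    4: Scenario(True, True, False),
    5: Scenario(False, False, False),
}

MAX_SEC, MAX_CT, MAX_ATT = 105, 60, 30  # assumed maxima for a 3-credit course

def synthetic_marks(s: Scenario, rng: random.Random,
                    previous: Optional[Tuple[int, int]] = None):
    """Return (secA, secB, ct, attendance) marks for one student and course."""
    if s.carried_over and previous is not None:
        ct, att = previous  # reuse the earlier CT and attendance marks
    else:
        ct = rng.randint(0, MAX_CT) if s.appeared_ct else 0
        # Draw an attendance rate on the right side of 60%, then scale to marks.
        rate = rng.uniform(0.6, 1.0) if s.attendance_ok else rng.uniform(0.0, 0.6)
        att = round(rate * MAX_ATT)
    if s.appeared_final:
        sec_a, sec_b = rng.randint(0, MAX_SEC), rng.randint(0, MAX_SEC)
    else:
        sec_a = sec_b = 0  # no final examination, so no answer-script marks
    return sec_a, sec_b, ct, att
```

Passing a seeded `random.Random` instance keeps the generated dataset reproducible, which matters if the same synthetic table is later used to compare mining algorithms.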
Methods of Preprocessing Academic Data for Mining (Contd.)
Universal Database & Synthetic Data Population (contd.)
Two algorithms have been developed to populate the universal table:
Synthetic_Generation()
Generate_Grade()
Student_Id  CSE303_secA  CSE303_secB  CSE303_CT  CSE303_Attendance  CSE303_Total  CSE303_Grade  …
0805001     90           75           55         30                 250           A+
0805002     85           70           45         25                 225           A
…           …            …            …          …                  …             …
Records of all courses taken by a student are generated synthetically in a single row of the universal table, keyed by student ID.
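As a rough illustration of how one wide row of the universal table could be assembled, the sketch below flattens per-course marks into columns named after the table header. In both sample rows the total equals the sum of the four components (90 + 75 + 55 + 30 = 250 and 85 + 70 + 45 + 25 = 225), so the sketch computes it that way. The functions `universal_row` and `generate_grade` are hypothetical stand-ins, not the authors' algorithms, and the percentage cutoffs are an assumed grading scale: the slides only show that 250 maps to A+ and 225 maps to A.

```python
# Assumed grading scale as (minimum percentage, letter); only the A+ and A
# rows are consistent with data shown in the slides, the rest is illustrative.
GRADE_CUTOFFS = [
    (80, "A+"), (75, "A"), (70, "A-"), (65, "B+"), (60, "B"),
    (55, "B-"), (50, "C+"), (45, "C"), (40, "D"),
]

def generate_grade(total, max_total=300):
    """Map a total mark to a letter grade using the assumed cutoffs."""
    pct = 100 * total / max_total
    for minimum, letter in GRADE_CUTOFFS:
        if pct >= minimum:
            return letter
    return "F"

def universal_row(student_id, course_marks):
    """Flatten {course: (secA, secB, ct, attendance)} into one wide row."""
    row = {"Student_Id": student_id}
    for course, (sec_a, sec_b, ct, att) in course_marks.items():
        total = sec_a + sec_b + ct + att
        row[f"{course}_secA"] = sec_a
        row[f"{course}_secB"] = sec_b
        row[f"{course}_CT"] = ct
        row[f"{course}_Attendance"] = att
        row[f"{course}_Total"] = total
        row[f"{course}_Grade"] = generate_grade(total)
    return row

row = universal_row("0805001", {"CSE303": (90, 75, 55, 30)})
print(row["CSE303_Total"], row["CSE303_Grade"])  # 250 A+
```

Adding a second course to `course_marks` simply appends six more columns to the same row, which is the single-row-per-student layout the slide describes.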
Methods of Preprocessing Academic Data for Mining (Contd.)
Data Transformation
Definition                    Credit Hour  Range
SecA_high or SecB_high        3            >=75 && <=105
SecA_average or SecB_average  3            >=60 && <75
SecA_low or SecB_low          3            <60
CT_high                       3            >=48 && <=60
CT_average                    3            >=36 && <48
CT_low                        3            <36
Grade_high                    3            >=225 && <=300
Grade_average                 3            >=180 && <225
Grade_low                     3            <180
Transformation rule table for 3.0 credit course
Student_ID  SecA_high  SecA_average  SecA_low  SecB_high  SecB_average  SecB_low  CT_high  CT_average  CT_low  Grade_high  Grade_average  Grade_low
0805001     1          0             0         1          0             0         1        0           0       1           0              0
0805002     1          0             0         0          1             0         0        1           0       1           0              0
…           …          …             …         …          …             …         …        …           …       …           …              …
Transformed table from universal table
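The transformation from marks to binary items can be sketched as a threshold lookup. The lower bounds below come from the 3.0 credit rule table; the function names and the dictionary layout are hypothetical, upper bounds are not checked, and a value sitting exactly on a boundary is assigned to the higher band.

```python
# (high_min, average_min) per component, from the 3.0 credit rule table.
RULES_3_CREDIT = {
    "SecA": (75, 60),
    "SecB": (75, 60),
    "CT": (48, 36),
    "Grade": (225, 180),
}

LEVELS = ("high", "average", "low")

def transform(prefix, value):
    """Return three 0/1 items, exactly one of which is set to 1."""
    high_min, average_min = RULES_3_CREDIT[prefix]
    if value >= high_min:
        level = "high"
    elif value >= average_min:
        level = "average"
    else:
        level = "low"
    return {f"{prefix}_{lvl}": int(lvl == level) for lvl in LEVELS}

def transform_row(student_id, sec_a, sec_b, ct, total):
    """Build one row of the transformed table from one universal-table row."""
    row = {"Student_ID": student_id}
    for prefix, value in (("SecA", sec_a), ("SecB", sec_b),
                          ("CT", ct), ("Grade", total)):
        row.update(transform(prefix, value))
    return row

print(list(transform_row("0805002", 85, 70, 45, 225).values())[1:])
# [1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0]
```

Applied to the two sample students, this reproduces the two rows of the transformed table.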
Association Rules for Academic Data
No.  Association Rule of Interest                   Purpose
1.   Course_No => CGPA_high                         Performance of Individual Course
2.   Course_No => CGPA_low
3.   Sec_A_high => CGPA_high                        Impact of Section of Answer Script
4.   Sec_B_high => CGPA_high
5.   CT_high => CGPA_high                           Impact of Class test
6.   CT_low => CGPA_low
7.   Hall_Resident => CGPA_low                      Impact of Residence
8.   Attached => CGPA_high
9.   Course_No_1 => Course_No_2                     Correlation of courses
10.  (Course_No_1, Course_No_2) => Course_No_3
11.  Permanent_Address_City => CGPA_high            Impact of locality
12.  Permanent_Address_Rural => CGPA_low
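Rules of this form are usually judged by their support and confidence, which can be computed directly from a table of 0/1 items. The sketch below uses the standard definitions; the helper names are hypothetical, and because the sample transformed table has no CGPA column, `Grade_high` is used here as a stand-in consequent.

```python
def support(rows, items):
    """Fraction of rows in which every item in `items` equals 1."""
    if not rows:
        return 0.0
    hits = sum(all(row.get(item, 0) == 1 for item in items) for row in rows)
    return hits / len(rows)

def confidence(rows, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(rows, antecedent)
    if base == 0:
        return 0.0  # the rule never fires, so report zero confidence
    return support(rows, list(antecedent) + list(consequent)) / base

# The two sample students from the transformed table (relevant items only).
rows = [
    {"SecA_high": 1, "CT_high": 1, "CT_average": 0, "Grade_high": 1},
    {"SecA_high": 1, "CT_high": 0, "CT_average": 1, "Grade_high": 1},
]

print(support(rows, ["CT_high", "Grade_high"]))       # 0.5
print(confidence(rows, ["CT_high"], ["Grade_high"]))  # 1.0
```

With only two rows the numbers are not meaningful; the point is the shape of the computation that a mining algorithm would repeat over every candidate rule.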
Future Work
Academic Performance: Family Background, Previous Academic Record, Seat Allotment in Hall, Offering Scholarship
Abandonment/Retention: Stay Duration, Session Jam, Unwanted leaves, Long term break
Condition of Institution: Average CGPA of all students, Term completion rate, Abandonment/retention rate, Research & Publications
Developing a new mining algorithm, which will be tested using the synthetic dataset
Collecting real data from BIIS and using it, without compromising privacy, to discover knowledge
Conclusions
Transforms continuous-valued attributes into discrete items so that association rule mining algorithms can be applied to obtain the required educational knowledge
Guides the discovery of the required knowledge from a realistic dataset and its application in real-life scenarios
The model is developed using BIIS data but can be generalized to any higher educational institution