UMAP 2016 transcript
A Framework for Dynamic Knowledge Modeling in Textbook-Based Learning
Yun Huang1, Michael Yudelson2, Shuguang Han3,
Daqing He3, Peter Brusilovsky3
1Intelligent Systems Program , 3School of Information Sciences , University of Pittsburgh
2Human-Computer Interaction Institute, Carnegie Mellon University
1
Motivation
Goal: Dynamically estimate learners’ knowledge levels
3
How is it usually done?
1. Problem solving tutor and data
2. Expert hand-crafted knowledge components (KCs)
3. Probabilistic student model
• Bayesian network-based, e.g., Knowledge Tracing
• Logistic regression-based, e.g., Performance Factor Analysis
Motivation
• Large numbers of reading interactions are available from learners, but they are not used in canonical student modeling.
5
• Prior studies of learner modeling:
• Use problem-solving performance data (not always available, sufficient, or timely for reading!)
• Estimate knowledge at the end of all reading activities (not timely!)
• Use expert-crafted knowledge components (KCs): but we read not only in well-designed systems, but also in an open corpus!
• We propose:
• Reconstruct popular student models for dynamic knowledge estimation
for reading activities
• Automatic text analysis for KC extraction
• Start from a textbook-based learning environment
Outline
• Motivation
• Methods
•Problem Statement
•Student Models for Reading
•Extracting Knowledge Components
• Experiments and Results
• Discussion and Conclusion
6
Problem Statement
• Main idea: Use reading time to infer latent knowledge and use latent knowledge to predict reading time
Recommend materials where learners have low knowledge levels
Avoid recommending materials for which we predict a short reading time
• Make several assumptions about the learning process (see next slide)
• Evaluate the KC and student models by
• Predictive accuracy of reading time
provides insight for designing user studies
• KC model’s semantics, granularity, and the ability to capture transfer learning
7
Learning Process Assumptions
• A document (doc) consists of a set of KCs; each student can be in a learned or unlearned state for each KC; each doc is labeled as Skim or Read.
8
• If a doc contains KCs that a student
has already learned, the student is
likely to skim the doc; otherwise, the
student is likely to read the doc
carefully.
• A student’s knowledge of a KC grows
through consistent reading of docs
that contain the same KC.
Outline
• Motivation
• Methods
•Problem Statement
•Student Models for Reading
•Extracting Knowledge Components
• Experiments and Results
• Discussion and Conclusion
9
Knowledge Tracing-Based Model (proposed model)
10
Train, predict, and update in three parts.
1. Train:
1) Gather doc reading sequences (e.g., D1{KC1} = Read, D2{KC1, KC2} = Read, D3{KC1, KC2} = Skim)
2) Propagate observations from the doc level to the KC level, forming sequences per KC per student (KC1: Read, Read, Skim; KC2: Read, Skim)
3) Train parameters per KC using a hidden Markov model
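The doc-to-KC propagation in step 2) can be sketched as follows; this is a minimal illustration assuming a simple doc-to-KC mapping dict (the names `kc_sequences` and `doc_kcs` are hypothetical, not from the paper):

```python
from collections import defaultdict

def kc_sequences(doc_sequence, doc_kcs):
    """Propagate doc-level Skim/Read observations to the KC level.

    doc_sequence: ordered (doc_id, label) pairs for one student,
                  where label is 'R' (Read) or 'S' (Skim).
    doc_kcs: dict mapping doc_id -> set of KC ids.
    Returns a dict: KC id -> observation sequence for that KC.
    """
    seqs = defaultdict(list)
    for doc_id, label in doc_sequence:
        # Each KC contained in the doc receives the doc's observation.
        for kc in doc_kcs[doc_id]:
            seqs[kc].append(label)
    return dict(seqs)

# The example from the slide: D1{KC1}=Read, D2{KC1,KC2}=Read, D3{KC1,KC2}=Skim
docs = [("D1", "R"), ("D2", "R"), ("D3", "S")]
mapping = {"D1": {"KC1"}, "D2": {"KC1", "KC2"}, "D3": {"KC1", "KC2"}}
print(kc_sequences(docs, mapping))  # KC1: ['R', 'R', 'S'], KC2: ['R', 'S']
```

The resulting per-KC sequences are what the per-KC hidden Markov models in step 3) are trained on (the paper uses the hmm-scalable tool for this).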
Knowledge Tracing-Based Model (proposed model)
11
2. Predict:
1) Get a new doc (e.g., D4 containing KC1 and KC2: Read or Skim?)
2) Conduct prediction within each KC (e.g., KC1 with P(L) = 0.7 gives P(S) = 0.8; KC2 with P(L) = 0.8 gives P(S) = 0.85)
3) Predict reading time on a doc by averaging the predictions across KCs (e.g., P(S) = 0.825, so predict Skim)
3. Update:
1) Observe the actual reading time of this doc and update the related KCs' knowledge by Bayes' theorem (e.g., after observing Skim, P(L) rises to 0.8 and 0.95)
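The predict/update cycle can be sketched with a standard Knowledge Tracing-style formulation; the guess/slip/learn parameter values below are purely illustrative (the paper fits them per KC with a hidden Markov model), so the numbers differ from the slide's example:

```python
def p_skim(pL, guess=0.2, slip=0.1):
    """P(Skim) for one KC: a student who has learned the KC skims
    unless they 'slip'; an unlearned student skims only by 'guess'.
    Parameter names follow standard Knowledge Tracing; values are illustrative."""
    return pL * (1 - slip) + (1 - pL) * guess

def predict_doc(kc_pL):
    """Predict the Skim probability of a doc by averaging over its KCs."""
    return sum(p_skim(p) for p in kc_pL.values()) / len(kc_pL)

def bayes_update(pL, skimmed, guess=0.2, slip=0.1, learn=0.1):
    """Update P(learned) for one KC after observing Skim or Read
    (Bayes' theorem), then apply the learning transition."""
    if skimmed:
        post = pL * (1 - slip) / (pL * (1 - slip) + (1 - pL) * guess)
    else:
        post = pL * slip / (pL * slip + (1 - pL) * (1 - guess))
    return post + (1 - post) * learn

knowledge = {"KC1": 0.7, "KC2": 0.8}
p = predict_doc(knowledge)                      # averaged P(Skim) = 0.725
label = "Skim" if p >= 0.5 else "Read"          # -> "Skim"
# After observing an actual Skim, each related KC's P(L) increases.
knowledge = {kc: bayes_update(pL, skimmed=True) for kc, pL in knowledge.items()}
```

The averaging in `predict_doc` mirrors step 3) of the prediction; the paper notes that handling multiple KCs per doc is a direction for improvement.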
Baselines
12
Logistic Regression-Based Models (high baselines)
• Performance Factor Analysis-based (PFA-based): The probability of Skimming a doc is modeled as a logistic function of each underlying KC's initial easiness and previous Skimming and Reading activities.
• Additive Factor Model-based (AFM-based): The probability of Skimming a doc is modeled as a logistic function of a student's ability, each underlying KC's initial easiness, and previous reading activities.
Majority Class Model (low baseline)
• 67% of activities are labeled as Read in our dataset, so this model always predicts Read.
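A PFA-style baseline reduces to a logistic function over per-KC features; this is a minimal sketch in which the feature weights (`beta_kc` for easiness, `gamma`/`rho` for prior Skim/Read counts) are hypothetical stand-ins for coefficients that would be fitted, e.g., with the modified LIBLINEAR the paper uses:

```python
import math

def pfa_p_skim(beta_kc, skim_counts, read_counts, gamma, rho):
    """PFA-style probability of Skimming a doc: a logistic function of
    each underlying KC's initial easiness (beta) plus weighted counts of
    the student's prior Skim/Read events on that KC.
    All parameter values are illustrative, not fitted."""
    logit = sum(beta_kc[k]
                + gamma[k] * skim_counts[k]
                + rho[k] * read_counts[k]
                for k in beta_kc)
    return 1 / (1 + math.exp(-logit))

# A doc covering one KC with neutral easiness and no history -> P(Skim) = 0.5
p = pfa_p_skim({"KC1": 0.0}, {"KC1": 0}, {"KC1": 0},
               {"KC1": 0.5}, {"KC1": 0.2})
```

The AFM-based variant differs only in its features: a per-student ability term plus a single count of prior reading opportunities per KC, rather than separate Skim and Read counts.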
Outline
• Motivation
• Methods
•Problem Statement
•Student Models for Reading
•Extracting Knowledge Components
• Experiments and Results
• Discussion and Conclusion
13
Extracting Knowledge Components
14
We consider three ways:
• Word: Treating each word (term) in the docs as a KC; each document is mapped to multiple KCs.
• Chapter: Treating each chapter in the textbooks as a KC; each document is mapped to one KC.
• Latent Topic: Treating each latent topic extracted by an LDA (latent Dirichlet allocation) model as a KC; each document is mapped to multiple KCs based on a probability threshold.
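The thresholded doc-to-topic mapping for the Latent Topic option can be sketched as below. The threshold value of 0.3 is illustrative (the paper's actual threshold is not stated here), and the topic distributions would come from whatever LDA implementation is used:

```python
def map_docs_to_kcs(doc_topic_probs, threshold=0.3):
    """Map each doc to the latent topics (KCs) whose LDA probability
    meets a threshold. doc_topic_probs: list of per-doc topic
    distributions (each row sums to 1); threshold is illustrative."""
    return [{t for t, p in enumerate(row) if p >= threshold}
            for row in doc_topic_probs]

# e.g., LDA output for 3 docs over 3 topics
probs = [[0.7, 0.2, 0.1],
         [0.4, 0.4, 0.2],
         [0.1, 0.1, 0.8]]
print(map_docs_to_kcs(probs))  # [{0}, {0, 1}, {2}]
```

Note that, unlike the Word and Chapter options, this mapping lets a document share KCs with thematically related documents even when they use different vocabulary, which is what enables the transfer modeling discussed later.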
Outline
• Motivation
• Methods
•Problem Statement
•Student Models for Reading
•Extracting Knowledge Components
• Experiments and Results
• Discussion and Conclusion
15
System: Reading Circle, for an Interactive Systems Design course at the University of Pittsburgh, with 10,188 activities on 325 docs from 289 students.
16
Experiment Setup
• Discretization of Time: the 33rd percentile of the time distribution per document constitutes a per-doc cutoff to differentiate Skim and Read (consistent with findings reported in the literature).
• Cross-validation and Evaluation: 10-fold student-stratified CV; evaluate the prediction of reading time; compute the average RMSE and AUC across the 10 folds and report a 95% CI.
• Tools: hmm-scalable, modified LIBLINEAR (https://github.com/IEDMS/)
17
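The per-doc discretization can be sketched as follows; the nearest-rank percentile and the strict tie handling are assumptions for illustration, since the exact procedure is not detailed on the slide:

```python
from collections import defaultdict

def percentile(values, q):
    """Nearest-rank percentile (q in [0, 100]) of a list of values."""
    s = sorted(values)
    k = max(0, min(len(s) - 1, int(round(q / 100 * (len(s) - 1)))))
    return s[k]

def label_events(events):
    """Label each reading event Skim or Read using the 33rd percentile
    of that doc's reading-time distribution as a per-doc cutoff.
    events: list of (doc_id, student_id, seconds) tuples."""
    by_doc = defaultdict(list)
    for doc, _, t in events:
        by_doc[doc].append(t)
    cutoff = {doc: percentile(ts, 33) for doc, ts in by_doc.items()}
    # Times strictly below the cutoff count as Skim (tie handling assumed).
    return [(doc, s, "Skim" if t < cutoff[doc] else "Read")
            for doc, s, t in events]

labels = label_events([("D1", "s1", 10), ("D1", "s2", 50), ("D1", "s3", 90)])
```

Because the cutoff is computed per document, a fast reader of a long chapter is not mislabeled by comparison against times on short documents; the later slides note that per-student reading-speed differences remain future work.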
KT-based model varying KC models
18
Proposed KT-based model consistently outperforms the majority class baseline, showing that
• the hypothesized learning process is reasonable (to some extent)
• the student model is quite robust across different KC models
LR-based model varying KC models
19
LR-based models consistently outperform the majority class baseline, confirming
that the hypothesized learning process is reasonable (to some extent).
KT-based vs. LR-based models
20
Proposed KT-based model also consistently outperforms LR-based models (high baselines), with the additional benefit of providing explicit knowledge estimation for each KC.
Latent topic-based KC model
The latent topic-based KC model provides the highest predictive ability. Although not statistically significantly better, its level of granularity, semantic relation modeling, and transfer ability might offer significant benefits for real-world personalization.
21
Discussion and Conclusion
• Formulated the problem of modeling dynamic knowledge in reading as a dynamic reading time prediction problem, which is useful when real-time problem performance data is not available.
• Consider evaluating with available problem performance data and a classroom study.
• Proposed KT-based model is promising for modeling learning in reading.
• Consider improvement in handling multiple KCs and incorporating richer reading behaviors in the future.
23
Discussion and Conclusion
• Used simple discretization to handle reading time.
• Consider individual student differences in reading speed in the future.
• Latent topic-based KC extraction is promising.
• Consider combining with textbook structure in the future.
• This work is still preliminary, but the framework generalizes to the broader context of open-corpus personalized learning. You are welcome to improve it!
24
Acknowledgements
•National Science Foundation Cyber-Human Systems (CHS) Program under Grant IIS-1525186.
•National Science Foundation support for UMAP students
25