From Expert-Driven to Data-Driven Adaptive Learning
Peter Brusilovsky
School of Computing and Information, University of Pittsburgh
Adaptive (Personalized) Learning
• Take into account student individual features (knowledge, goals, personal traits…)
• Improve learning by providing different learning support (personalization!) to different students
  – Adaptive sequencing in CAI
  – Adaptive navigation support and adaptive presentation in adaptive hypermedia
  – Mastery learning in intelligent tutoring systems
  – Style-adapted hypermedia
Brusilovsky, P. and Peylo, C. (2003) Adaptive and intelligent Web-based educational systems. International Journal of Artificial Intelligence in Education 13 (2-4), 159-172.
Exercise Sequencing in ELM-ART
Weber, G. and Brusilovsky, P. (2001) ELM-ART: An adaptive versatile system for Web-based instruction. International Journal of Artificial Intelligence in Education 12 (4), 351-384.
Mastery learning in Algebra Tutor
Koedinger, K. R., Anderson, J. R., Hadley, W. H., and Mark, M. A. (1997) Intelligent tutoring goes to school in the big city. International Journal of Artificial Intelligence in Education 8, 30-43.
Adaptive Textbook in InterBook
Brusilovsky, P., Eklund, J., and Schwarz, E. (1998) Web-based education for all: A tool for developing adaptive courseware. Seventh International World Wide Web Conference, Australia, 14-18 April 1998, pp. 291-300.
Style-adaptive Hypermedia: AES-CS
Interface for field-independent learners
Triantafillou, E., Pomportis, A., and Demetriadis, S. (2003) The design and the formative evaluation of an adaptive educational system based on cognitive styles. Computers and Education, 87-103.
Personalization needs knowledge
• Knowledge-based personalization
  – Domain model (network of skills and concepts - KCs)
  – Mapping between learning content (problems, problem steps, textbook pages) and KCs
  – Student modeling rules (how action with content changes student knowledge)
  – Personalization (what to do given specific knowledge state)
• Learning-style-based personalization
  – How to map an individual to a specific style (questionnaire)
  – Which learning content or interface to offer to each style (rules)
Domain and content models
[Figure: domain and content models. Examples (Example 1, Example 2, ... Example M) and Problems (Problem 1, Problem 2, ... Problem K) are linked to Concepts (Concept 1, Concept 2, ... Concept N).]
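The four ingredients of knowledge-based personalization can be sketched in a few lines. This is a minimal illustration, not the method of any system named in the talk: the concept names, the content-to-KC mapping, the update rule, and the selection rule are all hypothetical assumptions.

```python
# Hypothetical content-to-KC mapping (all names are illustrative assumptions).
CONTENT_TO_KCS = {
    "problem_1": ["loops"],
    "problem_2": ["loops", "arrays"],
    "problem_3": ["arrays", "recursion"],
}

def update_knowledge(knowledge, item, correct, step=0.3):
    """Assumed student modeling rule: move each linked KC toward 1 on
    success and toward 0 on failure by a fixed fraction `step`."""
    target = 1.0 if correct else 0.0
    updated = dict(knowledge)
    for kc in CONTENT_TO_KCS[item]:
        old = updated.get(kc, 0.0)
        updated[kc] = old + step * (target - old)
    return updated

def next_item(knowledge, mastery=0.8):
    """Assumed personalization rule: among items that still contain an
    unmastered KC, offer the one the student is best prepared for."""
    candidates = []
    for item, kcs in CONTENT_TO_KCS.items():
        levels = [knowledge.get(kc, 0.0) for kc in kcs]
        if min(levels) < mastery:  # something is left to learn here
            candidates.append((sum(levels) / len(levels), item))
    return max(candidates)[1] if candidates else None
```

With a knowledge state of `{"loops": 0.9, "arrays": 0.5, "recursion": 0.0}`, `problem_1` is skipped as mastered and `problem_2` is offered, since it is the best-prepared item that still has something to teach.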
Expert Driven vs. Data-Driven
• Expert-Driven Personalized Learning
  – Knowledge is provided by domain and learning science experts in several ways
  – Good quality
  – Expensive, hard to scale, subject to biases
• Data-Driven Personalized Learning
  – Knowledge is extracted from data
  – Good potential (scalability, objectivity)
  – Main challenge is achieving good quality
Data-Driven Personalization
• Where to get knowledge
  – Wisdom of instructors: encapsulated in content, structure, and rules developed by instructors
  – Wisdom of learners: encapsulated in traces left by past learners
  – Tuned by success/failure rate
• How could this knowledge be used?
  – Empower humans: Visual Learning Analytics
  – Empower decision algorithms: Educational Data Mining
Visual Learning Analytics
• The idea: Present data in visual form to students, instructors, and administrators, helping them make better decisions about the learning process
• Support self-regulated learning
• Provide navigation support for students
• Show performance to instructors to make decisions
• Show data to administrators to redesign the process
Educational Data Mining
• The idea: Feed data to various data mining and machine learning approaches to improve existing automated learning and discover important things for future improvements
• Better domain modeling
• Better student modeling
• Better adaptation approaches
• Finding what works best for different groups and students
Research at PAWS Lab, U of Pittsburgh
• http://adapt2.sis.pitt.edu/wiki/
• Social navigation in e-learning
  – open social learner modeling
• Data-driven individual differences
  – problem solving genome
• Domain modeling and latent concept discovery
• Content modeling
  – mining prerequisites and outcomes from textbooks
• Data-driven student modeling
• Open corpus adaptive hypermedia
Data-Driven Personalized Learning
[Figure: 2×2 grid of panels a, b, c, d. Axis labels: "Wisdom of: Teachers, Students" and "Better Interfaces, Personalized Decision-making"]
• a: social navigation & open social learner modeling
• c: problem solving genome
• b, d: mining prerequisites and outcomes from textbooks
Social Navigation Support
• Students need personalized guidance (navigation support) to access the right content at the right time
• Traditional knowledge-based navigation support requires considerable knowledge engineering
• Social navigation uses behavior of past users to guide new users
• Can we use “wisdom” extracted from the work of a community of learners to replace knowledge-based guidance?
• Knowledge engineering vs. data analysis
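As a rough illustration of how past behavior can stand in for engineered knowledge, the sketch below turns a visit log into "footprint" levels that an interface could use to annotate links. The log format, the function name, and the three-level scale are assumptions made for this example, not details of Knowledge Sea II or any other system.

```python
def footprint_levels(visit_log, levels=3):
    """visit_log: iterable of (user, page) pairs left by past learners.
    Returns {page: level}, where level runs from 0 (least visited) to
    levels - 1 (most visited), based on distinct visitors per page."""
    visitors = {}
    for user, page in visit_log:
        visitors.setdefault(page, set()).add(user)  # repeat visits count once
    if not visitors:
        return {}
    max_count = max(len(users) for users in visitors.values())
    # Integer scaling keeps every level strictly below `levels`.
    return {page: len(users) * levels // (max_count + 1)
            for page, users in visitors.items()}
```

No domain model is needed here: the only input is the trace data, which is the trade-off between knowledge engineering and data analysis that the slide points to.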
Knowledge Sea II (+ AnnotatEd)
Farzan, R. and Brusilovsky, P. (2008) AnnotatEd: A social navigation and annotation service for web-based educational resources. New Review in Hypermedia and Multimedia 14 (1), 3-32.
Open Social Student Modeling
• Key ideas
  – Make traditional student models open to the users
  – Allow students to compare themselves with class and peers
  – Social navigation based on performance data
• Main challenge
  – How to design the interface to make an easy comparison and provide social guidance and motivation
  – We went through several attempts
Open Social Student Modeling
Brusilovsky, P., Somyurek, S., Guerra, J., Hosseini, R., Zadorozhny, V., and Durlach, P. (2016) Open Social Student Modeling for Personalized Learning. IEEE Transactions on Emerging Topics in Computing 4 (3), 450-461.
Impact of OSSM on Learning
• A study in 2 sections of a Database course
• Student knowledge significantly increased in both groups
• The mean learning gain was higher in the OSSM group
• Students who used the OSSM interface worked more efficiently
• OSSM significantly increases engagement with all kinds of learning content (2-4 times!)
• Much higher retention in the OSSM group (3 times!)
• (Why are engagement and retention important?)
Problem-Solving Genome
• Key ideas
  – Individual differences are important for understanding students and adapting learning
  – "Old generation" of individual differences (i.e. learning styles) not valuable in e-learning context
  – Could we use "data-driven" science, extracting individual differences from behavior data?
• Main challenge
  – How to process the data to find and use individual differences
• Our approach uses sequence mining and profiling based on the use of micro-sequences
Context: Parameterized Java Exercises
Some numbers change each time the exercise is loaded
Hard to game
[Figure: exercise from the QuizJet system]
Labeling Steps (attempts)
Correctness: Success (S) or Failure (F)
Time: Short (lowercase) or Long (uppercase)
  – Using median of the distribution of time per exercise
  – Using different distributions for first attempt

label   correctness   time
s       success       short
S       success       long
f       failure       short
F       failure       long
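The labeling scheme above can be sketched as follows. This is a simplified illustration: the input format and function name are assumptions, and the sketch uses a single median per exercise, so it leaves out the separate time distribution for first attempts mentioned on the slide.

```python
from statistics import median

def label_attempts(attempts):
    """attempts: list of (exercise_id, correct, seconds) tuples.
    Returns one label per attempt: 's'/'S' for success, 'f'/'F' for
    failure; uppercase when the time exceeds that exercise's median."""
    times = {}
    for exercise, _, seconds in attempts:
        times.setdefault(exercise, []).append(seconds)
    medians = {exercise: median(values) for exercise, values in times.items()}

    labels = []
    for exercise, correct, seconds in attempts:
        letter = "s" if correct else "f"
        is_long = seconds > medians[exercise]  # at or below median counts as short
        labels.append(letter.upper() if is_long else letter)
    return labels
```

Joining the labels of one student's consecutive attempts on an exercise gives a string such as `"Fss"`, which is the form the later sketches assume for a sequence.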
Pattern mining
• Using PexSPAM algorithm with gap = 0
• Each possible pattern of length 2 or higher is explored
• Support of a pattern: proportion of sequences containing the pattern (at least once)
  – Does not count multiple occurrences of the pattern within a sequence
• Select all patterns with minimum support of 1%
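To make the support definition concrete, here is a brute-force sketch. It is not PexSPAM: it simply enumerates contiguous sub-sequences (which is what gap = 0 implies) and counts each pattern at most once per sequence. The `max_len` cap and the function name are assumptions added to keep the example small.

```python
def frequent_patterns(sequences, min_support=0.01, max_len=4):
    """sequences: list of label strings such as 'fFs'.
    Returns {pattern: support} for contiguous patterns of length
    2..max_len whose support is at least min_support."""
    counts = {}
    for seq in sequences:
        seen = set()  # a pattern counts once per sequence
        for n in range(2, max_len + 1):
            for i in range(len(seq) - n + 1):
                seen.add(seq[i:i + n])
        for pattern in seen:
            counts[pattern] = counts.get(pattern, 0) + 1
    total = len(sequences)
    return {p: c / total for p, c in counts.items() if c / total >= min_support}
```

For example, `frequent_patterns(["ffs", "fs", "ss"], min_support=0.5)` keeps only `"fs"`, which occurs in two of the three sequences.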
The Problem Solving Genome
• Constructed a frequency vector over the 102 patterns (vector of size 102) for each student
  – Each common pattern is a gene
• The vector represents how frequently a student uses each of the micro patterns
• The vector is an individual genome built from genes
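A genome vector of this kind might be computed as below. The normalization (relative frequency across the selected patterns) and the helper names are assumptions for illustration; the slide only states that the vector records how frequently each pattern is used.

```python
def count_occurrences(seq, pattern):
    """Count occurrences of `pattern` in `seq`, overlaps included."""
    n = len(pattern)
    return sum(1 for i in range(len(seq) - n + 1) if seq[i:i + n] == pattern)

def genome(student_sequences, patterns):
    """Return one relative frequency per pattern (a 'gene') for a student.
    student_sequences: that student's label strings; patterns: ordered list
    of frequent patterns shared by all students."""
    counts = [sum(count_occurrences(seq, p) for seq in student_sequences)
              for p in patterns]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]
```

Because every student is described over the same ordered pattern list, the resulting vectors are directly comparable and can be fed to a clustering step.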
Exploring the Genome
• Stability
  – Are the patterns stable for a given student?
• Effect of complexity
  – Are the patterns different across complexity levels?
• Patterns of success
  – Are successful students following different patterns?
Clustering by Genome
• Cluster students by their genomes and analyze different patterns
  – Between clusters
  – Between low and high students within each cluster
• Spectral Clustering with k = 2
  – Larger eigen-gap with k = 2
Guerra, J., Sahebi, S., Lin, Y.-R., and Brusilovsky, P. (2014) The Problem Solving Genome: Analyzing Sequential Patterns of Student Work with Parameterized Exercises. In: J. Stamper, Z. Pardos, M. Mavrikis and B. M. McLaren (eds.) Proceedings of the 7th International Conference on Educational Data Mining (EDM 2014) pp. 153-160.
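For k = 2, spectral clustering reduces to splitting on the sign of the second eigenvector of the graph Laplacian. The sketch below shows that special case in plain Python using power iteration. It is an illustration under stated assumptions, not the pipeline from the paper: the RBF similarity, `sigma`, the iteration count, and the start vector are all choices made here, and the eigen-gap itself is not computed. It assumes at least two points and a connected similarity graph.

```python
import math

def spectral_bipartition(vectors, sigma=1.0, iters=200):
    """Split points into two groups by the sign of the second eigenvector
    of the normalized graph Laplacian L = I - D^-1/2 W D^-1/2."""
    n = len(vectors)
    # Assumed similarity: RBF kernel, with zero self-similarity.
    W = [[0.0 if i == j else
          math.exp(-sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j]))
                   / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    # Trivial eigenvector of L (eigenvalue 0) is proportional to sqrt(deg).
    norm1 = math.sqrt(sum(deg))
    v1 = [math.sqrt(d) / norm1 for d in deg]
    # Power iteration on M = I + S = 2I - L with v1 projected out converges
    # to the eigenvector of L with the second-smallest eigenvalue.
    x = [float(i) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(x, v1))
        x = [a - dot * b for a, b in zip(x, v1)]
        x = [x[i] + sum(S[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in x))
        x = [a / norm for a in x]
    return [1 if a > 0 else 0 for a in x]
```

Which group receives label 0 or 1 depends on the start vector, so only the grouping itself is meaningful. With more than two clusters, the usual approach is to embed points with the first k eigenvectors and run k-means on the embedding.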
• Cluster 1: confirmers (repeat short successes)
• Cluster 2: non-confirmers (quitters)
Ordering patterns by difference magnitude (cluster 2 – cluster 1)
Using Clusters for Guidance
• Successful patterns in each cluster are closer to the other cluster
  – Successful confirmers tend to stop after long success
  – Successful non-confirmers (cluster 2) tend to continue after hard success
• Extremely different patterns between clusters are “harmful”
• How could it be used for personalization?
  – Identify student type
  – Offer a different interface or discourage poor behavior with recommendations
[Figure: pattern label "_FS_". Legend, preferred type of online learning content: E = Exercise, T = Text tutorial, X = Example, V = Video tutorial]
Open Corpus Adaptive Textbook
• Key ideas
  – Extract domain model (concepts) from multiple textbooks
  – Annotate each section/page with prerequisite and outcome concepts
  – Trace student reading, estimate student knowledge
  – Use for reading interface, navigation support, recommendation
• One of the challenges
  – How to leverage various indirect signals in textbooks to learn a general prerequisite/outcome model?
• Approach: Extracting concepts from encapsulated wisdom of book authors (KDD 2017 poster)
Resolving problems: HCI vs RecSys
Sources of distant supervision in textbooks
• Supervision Source 1: Unit Cohesiveness
  – Our hypothesis is that the author usually explains (i.e., outcome) a concept in one place (e.g., a chapter or a section)
• Supervision Source 2: Unit Titles
  – Our hypothesis is that the author of a textbook is more likely to include the concept’s name in the title of a unit (e.g., chapter or section) if the concept is an outcome concept
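The title hypothesis can be turned into a weak labeling rule, sketched below. The label names, the substring matching, and the unit format are assumptions for illustration; a real system would need proper concept matching and would treat these labels as noisy evidence rather than ground truth.

```python
def weak_labels(concept, units):
    """units: list of (title, text) pairs in textbook order.
    Returns one weak label per unit for the given concept name."""
    name = concept.lower()
    labels = []
    for title, text in units:
        if name in title.lower():
            labels.append("likely outcome")     # title signal fires
        elif name in text.lower():
            labels.append("unlabeled mention")  # prerequisite or outcome, unknown
        else:
            labels.append("absent")
    return labels
```

In the test data below the concept is mentioned in the first unit but named in the title of the second, so the second unit gets the outcome label even though it is not the first mention.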
Model 1: a concept is outcome in one place (cohesiveness)
[Figure: graphical model over concept i and unit j, with variables x_ij, y_ij, and z_i]
  – x_ij: features describing the context of the concept within the unit
  – y_ij: latent variable denoting whether the concept is a prerequisite or an outcome in this unit
  – z_i: latent variable denoting the unit in which the concept is explained
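One way to read the cohesiveness constraint is as a distribution over units for each concept: exactly one unit is the place where the concept is explained, and it is an outcome there and a prerequisite elsewhere. The sketch below encodes that reading with a softmax over per-unit scores. The scoring input (for example a weighted sum of the x_ij features) and both function names are hypothetical; the slides do not give the model's parameterization.

```python
import math

def outcome_distribution(scores):
    """scores[j]: hypothetical score for concept i in unit j, e.g. a
    weighted sum of the features x_ij. Returns P(z_i = j) as a softmax
    over the units, so the probabilities sum to 1 across units."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def label_units(scores):
    """Assign y_ij: 'outcome' in the most probable unit, 'prerequisite'
    in every other unit where the concept appears."""
    probs = outcome_distribution(scores)
    z = max(range(len(probs)), key=probs.__getitem__)
    return ["outcome" if j == z else "prerequisite" for j in range(len(probs))]
```

Normalizing across units is what ties the per-unit decisions together: raising the probability of one unit necessarily lowers it for the others.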
[Figure: model probability that the concept is an outcome in each section, plotted against textbook section index, with ground truth marked. Concept = "conditional independence"]
First mention of the concept is NOT necessarily where it is explained!
Model 2: concept’s appearance in the title makes it more likely that the concept is an outcome
[Figure: graphical model over concept i and unit j, with variables x_ij, y_ij, and t_ij, where t_ij indicates that the concept appears in the title of the unit]
[Table: cross-domain results for textbooks in Biology, Anatomy, Chemistry, Psychology, and Economics. Rows: textbook TRAINED on. Columns: textbook TESTED on.]
Prerequisite/Outcome models learned are able to generalize across domains
Leaving for the next time
• Domain modeling and latent topic discovery
  – Sahebi, S., Lin, Y.-R., and Brusilovsky, P. (2016) Tensor Factorization for Student Modeling and Performance Prediction in Unstructured Domain. Proceedings of the 9th International Conference on Educational Data Mining (EDM 2016), pp. 502-505.
• Data-driven student modeling
  – González-Brenes, J. P., Huang, Y., and Brusilovsky, P. (2014) General Features in Knowledge Tracing to Model Multiple Subskills, Temporal Item Response Theory, and Expert Knowledge. Proceedings of the 7th International Conference on Educational Data Mining (EDM 2014), London, UK, July 4-7, 2014, pp. 84-91.
• Open corpus modeling and personalization
  – Huang, Y., Yudelson, M., Han, S., He, D., and Brusilovsky, P. (2016) A Framework for Dynamic Knowledge Modeling in Textbook-Based Learning. In: Proceedings of 24th Conference on User Modeling, Adaptation and Personalization (UMAP 2016), pp. 141-150.
Acknowledgements
• Joint work with
  – Rosta Farzan, Sharon Hsiao, Tomek Loboda
  – Sherry Sahebi, Julio Guerra, Roya Hosseini
  – Yun Huang, Daqing He, Igor Labutov
• NSF Grants
  – CAREER 0447083
  – EHR 0310576
  – IIS 0426021
• ADL.net support for OSSM work
Visit us in Pittsburgh to Learn More!
… or Read our Papers
• http://www.pitt.edu/~peterb/papers.html
• https://www.researchgate.net/profile/Peter_Brusilovsky