Weakly Supervised Models of Aspect-Sentiment for Online Course Discussion Forums
ARTI RAMESH, SHACHI H. KUMAR, JAMES FOULDS, LISE GETOOR
2
• Massive: attracts thousands of participants
• Open: open access, content, and assessment
• Online: hosted online by education companies in partnership with top universities
3
Classroom
• Classroom – Face-to-face interaction between instructor and students
MOOCs
• MOOC Discussion Forums – Primary means of interaction between instructor and students
• Large number of students, posts: Hard to monitor manually
• Posts discuss problems in course - course material, errors, feedback
4
Example MOOC Posts
MOOC Post: The video is very choppy. Can somebody fix this?
Fine-grained Topic: Lecture-Video
MOOC Post: Will subtitles be made available for the lectures for this week? I liked the transcripts from last week.
Fine-grained Topic: Lecture-Subtitles
MOOC Post: Will everyone get a certificate or only people in the signature track?
Fine-grained Topic: Certificate
MOOC Post: When is quiz 4 due?
Fine-grained Topic: Quiz-Deadlines
5
Predicting fine-grained problems: Challenges
• Labeled data hard to obtain
– 5-10% posts contain problems
– Privacy concerns around data sharing
– Problems differ across courses
• Unsupervised/weakly supervised approaches desirable
– System not fine-tuned to one course, but can adapt across courses
6
Related Work
Aspect-sentiment in Online Reviews
• Semi-supervised generative model, with seed words to identify aspect clusters [Mukherjee et al., 2012]
• Unsupervised Aspect-Sentiment Model for Online Reviews [Brody et al., 2012]
• Hierarchical Aspect-Sentiment Model for Online Reviews [Kim et al., 2013]
MOOCs
• Predicting Instructor Intervention in MOOC Forums [Chaturvedi et al., 2014]
7
SeededLDA for MOOC Forums
SeededLDA
• Guide topic discovery by specifying representative seed words
• SeededLDA uses seeds to bias topic-word and document-topic distributions
• SeededLDA gathers words related to seed words
SeededLDA for MOOCs
• Many classes but a common set of seed words
• Seed words for MOOCs from syllabus and forums
Jagarlamudi et al. 2010
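One simple way to approximate the seeding idea is to give each topic a Dirichlet prior over the vocabulary that puts extra pseudo-counts on its seed words. The sketch below builds such a prior in plain Python. It is not the full SeededLDA model; the helper name `build_seeded_prior` and the `base` and `boost` values are assumptions chosen for illustration, and the seeds are a subset of the coarse aspect seeds in this deck.

```python
from typing import Dict, List


def build_seeded_prior(
    vocab: List[str],
    seeds: Dict[str, List[str]],
    base: float = 0.01,   # assumed small pseudo-count for every word
    boost: float = 1.0,   # assumed extra pseudo-count for seed words
) -> Dict[str, List[float]]:
    """Return one Dirichlet pseudo-count vector per seeded topic.

    Every word gets `base`; a topic's seed words get `base + boost`
    in that topic's vector only.
    """
    index = {word: i for i, word in enumerate(vocab)}
    prior: Dict[str, List[float]] = {}
    for topic, words in seeds.items():
        row = [base] * len(vocab)
        for word in words:
            if word in index:  # seeds missing from the vocabulary are ignored
                row[index[word]] = base + boost
        prior[topic] = row
    return prior


# Subset of the coarse aspect seeds listed in the slides.
seeds = {
    "LECTURE": ["lecture", "video", "transcript"],
    "QUIZ": ["quiz", "assignment", "midterm"],
}
vocab = ["lecture", "video", "transcript", "quiz", "assignment", "midterm", "hello"]
prior = build_seeded_prior(vocab, seeds)
```

A topic model that accepts a per-topic word prior could consume these vectors, so that the LECTURE topic starts out favouring "video" while the QUIZ topic does not.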
8
Hinge-loss Markov Random Fields & Probabilistic Soft Logic
• Hinge-loss Markov Random Fields (HL-MRFs)
– Logic-based MRFs that can reason about both discrete and continuous graph data scalably and accurately
– Efficient Inference: convex optimization in continuous space
• Probabilistic Soft Logic (PSL)
– Templating language for HL-MRFs
– Weighted logical rules to model dependencies
– Continuous variables in [0,1]
Bach et al. 2012
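PSL relaxes Boolean logic with the Łukasiewicz operators, so a weighted rule "body → head" over truth values in [0,1] becomes a hinge-shaped penalty, the rule's distance to satisfaction. A minimal sketch in plain Python follows; the helper names and the example truth values are assumptions for illustration only.

```python
def lukasiewicz_and(a: float, b: float) -> float:
    """Soft conjunction of two truth values in [0, 1]."""
    return max(0.0, a + b - 1.0)


def distance_to_satisfaction(body: float, head: float) -> float:
    """Hinge penalty for the rule body -> head; zero once head >= body."""
    return max(0.0, body - head)


def weighted_potential(weight: float, body: float, head: float,
                       squared: bool = False) -> float:
    """Weighted hinge-loss potential, linear or squared."""
    d = distance_to_satisfaction(body, head)
    return weight * (d * d if squared else d)


# Illustrative truth values: two body atoms and one head atom.
body = lukasiewicz_and(0.9, 0.8)
penalty = weighted_potential(2.0, body, head=0.4)
```

Because each potential is a hinge of a linear function of the variables, the sum of weighted potentials is convex, which is what makes inference a convex optimization problem.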
9
Predicting fine-grained problems and sentiment: Joint Prediction Problem
• Analogous to predicting aspect-sentiment in online reviews
• Aspect hierarchy connecting course elements
• HL-MRF framework
– Combining different features
– Encoding coarse-to-fine aspect hierarchy
– Encoding dependencies between aspect and sentiment
• Jointly modeling aspect and sentiment
10
Our Contributions
• Identify fine-grained aspects in online courses
• Extract course-specific features from posts using SeededLDA
• Construct coarse-to-fine aspect hierarchy to model aspect dependencies
• Construct weakly-supervised joint model for aspect-sentiment using HL-MRFs
• Validate system using crowdsourced posts sampled from 12 courses
11
MOOC Aspect-Sentiment Models: SeededLDA
• Coarse Aspect seeds
LECTURE: lecture, video, download, transcript, slide, note
QUIZ: quiz, assignment, question, midterm, exam, submission
CERTIFICATE: certificate, score, statement, signature
SOCIAL: name, course, introduction, study, group
• Sentiment seeds
POSITIVE: interest, exciting, thank, great, happy, glad, enjoy
NEGATIVE: problem, difficult, error, issue, unable, misunderstand
NEUTRAL: coursera, class, hello, everyone, greet, name
12
SeededLDA Model
• Fine Aspect seeds
LECTURE-VIDEO: video, problem, download, play, player,
LECTURE-AUDIO: volume, low, headphone, sound, audio, hear
LECTURE-LECTURER: professor, fast, speak, pace, follow, speed
LECTURE-SUBTITLES: transcript, subtitle, slide, note, lecture,
LECTURE-CONTENT: typo, error, mistake, wrong, right, incorrect
QUIZ-CONTENT: question, challenge, difficult, understand, typo
QUIZ-SUBMISSION: submission, submit, quiz, error, unable, resubmit
QUIZ-GRADING: answer, question, answer, grade, assignment, quiz
QUIZ-DEADLINE: due, deadline, miss, extend, late
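The fine aspect labels above carry their coarse parent as a prefix, so a coarse-to-fine hierarchy can be recovered from the names alone. A small sketch, assuming the `COARSE-FINE` naming used on this slide; `build_hierarchy` is a hypothetical helper, not the authors' code.

```python
from collections import defaultdict
from typing import Dict, List

# Fine aspect labels as they appear on the slide.
FINE_ASPECTS = [
    "LECTURE-VIDEO", "LECTURE-AUDIO", "LECTURE-LECTURER",
    "LECTURE-SUBTITLES", "LECTURE-CONTENT",
    "QUIZ-CONTENT", "QUIZ-SUBMISSION", "QUIZ-GRADING", "QUIZ-DEADLINE",
]


def build_hierarchy(fine_aspects: List[str]) -> Dict[str, List[str]]:
    """Group fine aspects under the coarse aspect named in their prefix."""
    hierarchy = defaultdict(list)
    for label in fine_aspects:
        coarse, _, fine = label.partition("-")  # split at the first hyphen
        hierarchy[coarse].append(fine)
    return dict(hierarchy)


hierarchy = build_hierarchy(FINE_ASPECTS)
```

Note that CONTENT appears under both LECTURE and QUIZ, which is why the coarse aspect matters when telling the two apart.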
13
PSL-Joint: Combining Features
SeededLDA score for fine aspect and coarse aspect to predict fine aspect of post P
14
PSL-Joint: Combining Features
SeededLDA score for sentiment and fine aspect to predict fine aspect
15
PSL-Joint: Encoding Dependencies
Dependency between coarse aspect and fine aspect
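The rule text on these slides did not survive in the transcript, so the sketch below only illustrates the general shape a coarse-to-fine dependency could take: a rule of the form FINE(P, f) → COARSE(P, parent(f)), scored as a hinge penalty. The predicate names, the weight, and all truth values are assumptions for illustration, not the authors' rules or data.

```python
from typing import Dict


def hinge(body: float, head: float) -> float:
    """Distance to satisfaction of body -> head."""
    return max(0.0, body - head)


def hierarchy_penalty(fine_label: str, fine_value: float,
                      coarse: Dict[str, float], parent: Dict[str, str],
                      weight: float = 1.0) -> float:
    """Penalty for the hypothetical rule FINE(P, f) -> COARSE(P, parent(f))."""
    return weight * hinge(fine_value, coarse[parent[fine_label]])


# Hypothetical coarse-aspect truth values for one post about an assignment.
coarse = {"LECTURE": 0.2, "QUIZ": 0.8}
parent = {"LECTURE-CONTENT": "LECTURE", "QUIZ-CONTENT": "QUIZ"}

# Penalty of asserting each fine aspect with truth value 0.9.
penalties = {f: hierarchy_penalty(f, 0.9, coarse, parent) for f in parent}
best = min(penalties, key=penalties.get)
```

With these made-up values, asserting LECTURE-CONTENT violates the rule far more than asserting QUIZ-CONTENT, so the hierarchy pushes the prediction toward the quiz branch.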
16
PSL-Joint: Encoding Dependencies
Dependency between sentiment and fine aspect
17
Experimental Evaluation
F-1 scores for SeededLDA and PSL-Joint for coarse aspects
Model      Lecture  Quiz   Certificate  Social
SeededLDA  0.632    0.657  0.459        0.654
PSL-Joint  0.630    0.706  0.621        0.659

SeededLDA and PSL-Joint for sentiment
Model      Positive  Negative  Neutral
SeededLDA  0.182     0.517     0.356
PSL-Joint  0.189     0.615     0.434
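For reference, a per-class F-1 score is the harmonic mean of precision and recall for that class. The sketch below computes it in plain Python on a toy label list; the labels are made up for illustration, and the exact evaluation protocol behind the tables on this slide is not given in the transcript.

```python
from typing import List


def f1_score(gold: List[str], predicted: List[str], label: str) -> float:
    """Per-class F-1: harmonic mean of precision and recall for `label`."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (not data from the study).
gold = ["quiz", "quiz", "lecture", "social", "quiz"]
pred = ["quiz", "lecture", "lecture", "social", "quiz"]
quiz_f1 = f1_score(gold, pred, "quiz")
lecture_f1 = f1_score(gold, pred, "lecture")
```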
18
Experimental Evaluation
PSL-Joint outperforms SeededLDA for most coarse aspects and sentiment
19
Experimental Evaluation
Fine-grained aspects under coarse aspect “lecture”
Model      Content  Video  Audio  Lecturer  Subtitles
SeededLDA  0.08     0.240  0.684  0.06      0.397
PSL-Joint  0.410    0.485  0.582  0.323     0.461

Fine-grained aspects under coarse aspect “quiz”
Model      Content  Submission  Deadlines  Grading
SeededLDA  0.011    0.437       0.214      0.514
PSL-Joint  0.36     0.416       0.611      0.550
20
Experimental Evaluation
PSL-Joint distinguishes between lecture-content and quiz-content
21
Experimental Evaluation
Significant improvement in scores for lecture-lecturer and quiz-deadlines
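The size of these improvements can be read off the fine-grained tables by subtracting the SeededLDA score from the PSL-Joint score for each aspect. A short sketch in plain Python, using the scores from the fine-grained aspect tables in this deck; the dictionary layout and variable names are assumptions.

```python
# Scores copied from the fine-grained aspect tables in this deck.
scores = {
    "SeededLDA": {
        "lecture-content": 0.08, "lecture-video": 0.240, "lecture-audio": 0.684,
        "lecture-lecturer": 0.06, "lecture-subtitles": 0.397,
        "quiz-content": 0.011, "quiz-submission": 0.437,
        "quiz-deadlines": 0.214, "quiz-grading": 0.514,
    },
    "PSL-Joint": {
        "lecture-content": 0.410, "lecture-video": 0.485, "lecture-audio": 0.582,
        "lecture-lecturer": 0.323, "lecture-subtitles": 0.461,
        "quiz-content": 0.36, "quiz-submission": 0.416,
        "quiz-deadlines": 0.611, "quiz-grading": 0.550,
    },
}

# Gain of PSL-Joint over SeededLDA per aspect, rounded to three decimals.
gains = {
    aspect: round(scores["PSL-Joint"][aspect] - scores["SeededLDA"][aspect], 3)
    for aspect in scores["SeededLDA"]
}

# Largest gains first.
ranked = sorted(gains.items(), key=lambda item: item[1], reverse=True)
for aspect, gain in ranked:
    print(f"{aspect:20s} {gain:+.3f}")
```

The same subtraction also shows where SeededLDA stays ahead: lecture-audio and quiz-submission have negative gains in these tables.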
22
Interpreting PSL-Joint Predictions
“There is a typo or other mistake in the assignment instructions (e.g. essential information omitted).”
SeededLDA Prediction: Lecture-content
PSL-Joint Prediction: Quiz-content
“Thanks for the suggestion about downloading the video and referring to the subtitles. The audio is barely audible, even when the volume is set to 100%”
SeededLDA Prediction: Lecture-subtitles
PSL-Joint Prediction: Lecture-audio
23
Conclusion: Fine-grained aspect-sentiment in MOOC forums
• Automatically detecting problems in forum posts useful for instructors
• Weakly supervised probabilistic framework to automatically detect aspect and sentiment in online courses
– SeededLDA and PSL-Joint models as means to encode domain information and predict aspect and sentiment
• PSL-Joint significantly outperforms SeededLDA for many fine aspects, coarse aspects, and sentiment
– Structural dependencies among aspect and sentiment help in prediction