causal data mining

Causal Data Mining Richard Scheines Dept. of Philosophy, Machine Learning, & Human-Computer Interaction Carnegie Mellon


Post on 22-Nov-2014

Page 1: Causal Data Mining

Causal Data Mining

Richard Scheines

Dept. of Philosophy, Machine Learning, &

Human-Computer Interaction

Carnegie Mellon

Page 2: Causal Data Mining

1. Predictive Data Mining

Finding predictive relationships in data

– What feature of student behavior predicts learning

– Who will default on credit cards

– Who will get an “A” in your course

– Which HS students will do well at CMU

– Do students cluster by “learning style”

Page 3: Causal Data Mining

Causal Data Mining

Finding causal relationships in data

– What feature of student behavior causes learning

– What will happen when we make everyone take a reading quiz before each class

– What will happen when we program our tutor to intervene to give hints after an error

Page 4: Causal Data Mining

Predictive Data Mining

     X1    X2   X3   ...   Xk    Y
1    1.7   28   M    ...   2.4   1
2    2.0   11   F    ...   1.1   0
3    1.9   17   F    ...   1.1   1
...
N    2.8   12   M    ...   1.8   0

Data Mining Search

Predictive Model

Y = f(X1, X2, …Xk)

Page 5: Causal Data Mining

Predictive Data Mining

Data Mining Search

Predictive Model

Y = f(X1, X2, …Xk)

Model Classes

1. Simple Regression

2. Locally Weighted Regression

3. Logistic Regression

4. Neural Nets

5. Support Vector Machines

6. Decision Trees

7. Bayes Net

8. Naïve Bayes Classifier

9. Independent Components

10. Clustering

11. Etc.
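
As a concrete instance of the first model class, here is a minimal sketch of fitting Y = f(X) by simple least-squares regression with a single predictor. The function names and the toy data are hypothetical and are not taken from the slides.

```python
from statistics import mean

def fit_simple_regression(xs, ys):
    """Return (intercept, slope) minimizing squared error for y ~ a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def predict(model, x):
    """Apply the fitted predictive model Y = f(X) to a new value of X."""
    intercept, slope = model
    return intercept + slope * x

# Hypothetical toy data generated from y = 1 + 2x with no noise.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
model = fit_simple_regression(xs, ys)
```

The fitted model only describes how Y varies with X in the observed data; it says nothing yet about what would happen if X were changed.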

Page 6: Causal Data Mining

Predictive Data Mining

Predictive Model under Constraints

Y = f(X1, X2, …Xk),

e.g., f ∈ Additive functions

Data Mining Search

Page 7: Causal Data Mining

Predictive Data Mining

Predictive Model under Constraints

Y = f(X1, X2, …Xk),

Or

Probability Model under Constraints:

P(Y | X1, X2, …, Xk), where P is Gaussian, with mean 0

Data Mining Search

Page 8: Causal Data Mining

Predictive Data Mining

Decision Tree Search

[Figure: decision tree predicting P(Hosp.). Internal nodes split on Age (threshold 57), X-Ray (Pos / Neg.), Lab1 (threshold 1.4), and Lab2 (thresholds 1.8 and 2.3). Leaf values: P(Hosp.) = .78, .59, .10, .66, .75, .05.]
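
One step of a decision tree search can be sketched as follows: for a single numeric feature, try every candidate threshold and keep the one that leaves the two branches most pure. This is a generic illustration using Gini impurity, not necessarily the search used for the tree on the slide; the ages and outcomes are invented toy data.

```python
def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'."""
    def gini(group):
        # Gini impurity of a list of 0/1 labels; an empty group counts as pure.
        if not group:
            return 0.0
        p = sum(group) / len(group)
        return 2 * p * (1 - p)

    best = None
    # The largest value is skipped because it would leave the right branch empty.
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (t, score)
    return best

# Hypothetical toy data: age and whether the patient was hospitalized.
ages = [30, 45, 52, 57, 60, 71, 80]
hospitalized = [0, 0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(ages, hospitalized)
```

A full tree search would apply the same step recursively to each branch and over every available feature.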

Page 9: Causal Data Mining

Predictive Data Mining ≠ Causal Data Mining

Conditioning: P(Y | X1, X2, …, Xk)

Intervening: P(Y | X1 set, X2, …, Xk), where X1 is set by intervention rather than observed

Conditioning is not the same as intervening
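
A small simulation can make the difference concrete. The sketch below assumes a hypothetical unmeasured common cause Z that drives both X and Y, while X has no effect on Y at all; the probabilities are invented for illustration. Observing X = 1 is then informative about Y, but setting X = 1 is not.

```python
import random

def simulate(n, set_x=None, seed=0):
    """Draw n (x, y) pairs; if set_x is given, X is set by intervention."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                      # hypothetical common cause
        if set_x is None:
            x = rng.random() < (0.9 if z else 0.1)  # observed X depends on Z
        else:
            x = set_x                               # intervention ignores Z
        y = rng.random() < (0.8 if z else 0.2)      # Y depends on Z only
        rows.append((x, y))
    return rows

def p_y_given_x(rows, x_value):
    """Relative frequency of Y = 1 among rows where X equals x_value."""
    ys = [y for x, y in rows if x == x_value]
    return sum(ys) / len(ys)

observed = simulate(20000)
p_cond = p_y_given_x(observed, True)      # estimate of P(Y=1 | X=1)

intervened = simulate(20000, set_x=True)
p_set = p_y_given_x(intervened, True)     # estimate of P(Y=1 | X set to 1)
```

Under these assumptions the conditional probability is about 0.74, while the probability under intervention stays near 0.5, the marginal rate of Y.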

Page 10: Causal Data Mining

Causal Discovery

Statistical Data → Causal Structure

Background Knowledge

- X2 before X3

- no unmeasured common causes

[Figure: diagram relating Data, Statistical Inference, Independence Relations among X1, X2, X3, Background Knowledge, a Discovery Algorithm, and the resulting Equivalence Class of Causal Graphs over X1, X2, X3. The link between graphs and independence relations is the Causal Markov Axiom (D-separation).]
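
The kind of independence relation a discovery algorithm consumes can be illustrated with a simulated linear chain. The sketch below uses hypothetical variables A → B → C (not the X1, X2, X3 of the slide) and checks that A and C are correlated but become nearly uncorrelated once B is conditioned on, as d-separation predicts for a chain. Partial correlation is used here as a simple stand-in for a conditional independence test.

```python
import random
from statistics import mean

def corr(a, b):
    """Pearson correlation of two equal-length samples."""
    a_bar, b_bar = mean(a), mean(b)
    cov = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    var_a = sum((x - a_bar) ** 2 for x in a)
    var_b = sum((y - b_bar) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def partial_corr(a, b, given):
    """First-order partial correlation of a and b, controlling for 'given'."""
    r_ab, r_ag, r_bg = corr(a, b), corr(a, given), corr(b, given)
    return (r_ab - r_ag * r_bg) / ((1 - r_ag ** 2) * (1 - r_bg ** 2)) ** 0.5

# Hypothetical linear chain A -> B -> C with independent Gaussian noise.
rng = random.Random(1)
n = 5000
a = [rng.gauss(0, 1) for _ in range(n)]
b = [x + rng.gauss(0, 1) for x in a]
c = [x + rng.gauss(0, 1) for x in b]

r_ac = corr(a, c)                     # clearly nonzero: A and C are dependent
r_ac_given_b = partial_corr(a, c, b)  # near zero: A independent of C given B
```

The same vanishing partial correlation would arise from A ← B ← C or A ← B → C, which is why the output of discovery is an equivalence class of graphs rather than a single graph.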

Page 11: Causal Data Mining

Causal Discovery Software TETRAD IV

www.phil.cmu.edu/projects/tetrad

Page 12: Causal Data Mining

Full Semester Online Course in Causal & Statistical Reasoning

Page 13: Causal Data Mining

Full Semester Online Course in Causal & Statistical Reasoning

• Course is tooled to record certain events: logins, page requests, print requests, quiz attempts, quiz scores, voluntary exercises attempted, etc.

• Each event was associated with attributes: time, student-id, session-id
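
An event log of this shape might be represented and summarized as in the sketch below. The `Event` class, the event-kind strings, and the sample records are all hypothetical; the slides do not describe the actual storage format.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str          # e.g. "login", "print_request", "quiz_attempt"
    time: float        # hypothetical timestamp, in minutes
    student_id: str
    session_id: str

# Invented sample records for illustration only.
events = [
    Event("login", 0.0, "s1", "a"),
    Event("print_request", 12.5, "s1", "a"),
    Event("quiz_attempt", 40.0, "s1", "a"),
    Event("login", 3.0, "s2", "b"),
    Event("print_request", 9.0, "s2", "b"),
    Event("print_request", 15.0, "s2", "b"),
]

def count_by_student(events, kind):
    """Count events of one kind per student, giving one feature per student."""
    return Counter(e.student_id for e in events if e.kind == kind)

prints = count_by_student(events, "print_request")
```

Per-student counts like these are the sort of variables (print, voluntary questions, quiz) that can then be fed to a predictive or causal search.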

Page 14: Causal Data Mining

Printing and Voluntary Comprehension Checks: 2002 --> 2003

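[Figure: two path models with edge coefficients. 2002 model over pre, print, voluntary questions, quiz, and final, with coefficients .302, -.41, .75, .353, .323. 2003 model over pre, print, voluntary questions, and final, with coefficients -.08, -.16, .41, .25.]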

Page 15: Causal Data Mining


References

• Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, Prediction, and Search, 2nd Edition. MIT Press.

• Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge Univ. Press.

• Shih, B., Koedinger, K., and Scheines, R. (2008). "A Response Time Model for Bottom-Out Hints as Worked Examples." Proceedings of the First Educational Data Mining Conference.

• Shih, B., Koedinger, K., and Scheines, R. (2007). "Optimizing Student Models for Causality." Proceedings of the 13th International Conference on Artificial Intelligence in Education.

• Arnold, A., Beck, J., and Scheines, R. (2006). "Feature Discovery in the Context of Educational Data Mining: An Inductive Approach." Proceedings of the AAAI2006 Workshop on Educational Data Mining, Boston, MA.

• Scheines, R., Leinhardt, G., Smith, J., and Cho, K. (2005). "Replacing Lecture with Web-Based Course Materials." Journal of Educational Computing Research, 32(1), 1-26.