Human Cognitive Emulation
By Lee Rumpf
Abstract
This project attempts to accurately model human responses to stimuli. Using a survey format, this research hopes to produce a unique response to a stimulus based on information gained about the user. Analyzed solely on its own, the results of this project can perhaps support broad conclusions about groups of people and how they respond. When combined with other techniques for emulating human thought patterns, computer programs can come closer to accurately representing human responses.
Scope of Project/Expected Results
Many data points
Decision trees and entropy calculations
More data = better results
Similar Projects
Sandia National Labs is doing a much more complex version of this project with many different approaches adding up to a complete psychological profile.
Dr. Ann Speed is one of the lead psychologists at Sandia working on Cognitive Emulation.
Similar tree-formats used in past.
Approach
lisp!!(((())))
Text-based, with an intranet poll for seniors; results returned in CSV format.
evolutionary programming
Tree branches = number of possible survey responses
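The branching rule above (one branch per possible survey response) can be sketched in Python, used here for illustration rather than the project's Lisp. The attribute names and rows are hypothetical examples, not actual survey data:

```python
from collections import defaultdict

def partition(attribute, examples):
    """Group survey responses by their value for one attribute.
    Each distinct value observed becomes one branch of the tree."""
    branches = defaultdict(list)
    for example in examples:
        branches[example[attribute]].append(example)
    return dict(branches)

# Hypothetical survey rows (attribute names are illustrative only).
survey = [
    {"TIMELAB": "LOW",  "SEX": "MALE",   "YES": "Y"},
    {"TIMELAB": "HIGH", "SEX": "FEMALE", "YES": "N"},
    {"TIMELAB": "LOW",  "SEX": "FEMALE", "YES": "Y"},
]

# One branch per observed value of TIMELAB: here "LOW" and "HIGH".
branches = partition("TIMELAB", survey)
```

Because the branch count equals the number of distinct responses, attributes with many possible answers fan the tree out quickly.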
Construction
(defun id3 (examples target.attribute attributes)
  (let (firstvalue a partitions)
    (setq firstvalue (get.value target.attribute (first examples)))
    (cond
      ;; every example agrees on the target value: return it as a leaf
      ((every #'(lambda (e) (eq firstvalue (get.value target.attribute e)))
              examples)
       firstvalue)
      ;; no attributes left to split on: return the majority value
      ((null attributes)
       (most.common.value target.attribute examples))
      ;; otherwise partition on each attribute, keep the best split,
      ;; and recurse into every branch with that attribute removed
      (t (setq partitions
               (loop for a in attributes collect (partition a examples)))
         (setq a (choose.best.partition target.attribute partitions))
         (cons (first a)
               (loop for branch in (cdr a)
                     collect (list (first branch)
                                   (id3 (cdr branch)
                                        target.attribute
                                        (remove (first a) attributes)))))))))
ID3 – the heart of Decision Tree work. First developed in 1975 by J. Ross Quinlan at the University of Sydney.
here in LISP
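A rough Python transliteration of the Lisp above may make the control flow easier to follow. The entropy-based attribute choice is folded into a `split_entropy` helper standing in for `partition`/`choose.best.partition`, whose exact definitions are not shown in the slides, so this is a sketch of the same algorithm rather than a line-for-line port:

```python
from collections import Counter
from math import log2

def id3(examples, target, attributes):
    """Returns either a leaf label or an (attribute, {value: subtree})
    pair, mirroring the cond clauses of the Lisp id3."""
    labels = [e[target] for e in examples]
    # Every example agrees on the target: return that value as a leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # No attributes left to split on: return the majority label.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]

    # Weighted entropy of the split induced by one attribute.
    def split_entropy(attr):
        total = len(examples)
        ent = 0.0
        for value in {e[attr] for e in examples}:
            subset = [e[target] for e in examples if e[attr] == value]
            counts = Counter(subset)
            ent += (len(subset) / total) * -sum(
                (c / len(subset)) * log2(c / len(subset))
                for c in counts.values())
        return ent

    # Best attribute = lowest remaining entropy (highest information gain).
    best = min(attributes, key=split_entropy)
    remaining = [a for a in attributes if a != best]
    return (best, {value: id3([e for e in examples if e[best] == value],
                              target, remaining)
                   for value in {e[best] for e in examples}})
```

The recursion removes the chosen attribute on each branch, exactly as the `(remove (first a) attributes)` call does in the Lisp.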
Results
[Slide shows the full decision tree in size-4 font. The root split is TIMESPORT; deeper splits use SEX, PHYSICS, TIMEDRAMA, TIMELAB, TIMEVIDEO, YEAR, CHEM, BIO, and RACE, with every path ending in a YES or NO prediction.]
After fewer than 100 data points, the decision tree has already become relatively accurate. The resulting tree is on the left in size-4 font. It is also an example of "overfitting" the tree: there aren't enough data points to justify so many branches, which makes some of the branches pointless and misleading.
More results
Most decisive factor in determining college major focus: time in Sys Lab
Most decisive factor in determining college classroom preparedness: whether or not higher math was taken.
Most decisive factor in determining college life: sports
Entropy
Perhaps the most obscure part of decision trees, entropy is used to find the best classifier by means of information gain.
Entropy(S) = -p_p log2(p_p) - p_n log2(p_n)
p_p is the proportion of positive examples, while p_n is the proportion of negative examples.
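A quick numeric check of the formula, as a Python sketch (a term with proportion 0 contributes 0 by the usual convention):

```python
from math import log2

def entropy(p_pos, p_neg):
    """Entropy(S) = -p_p*log2(p_p) - p_n*log2(p_n).
    Terms with proportion 0 are skipped (0*log2(0) is taken as 0)."""
    ent = 0.0
    for p in (p_pos, p_neg):
        if p > 0:
            ent -= p * log2(p)
    return ent

# A 50/50 split is maximally uncertain: entropy = 1 bit.
even = entropy(0.5, 0.5)   # 1.0
# A pure set (all positive) carries no uncertainty: entropy = 0.
pure = entropy(1.0, 0.0)   # 0.0
```

ID3 prefers the attribute whose split drives this value down the most, i.e. the one with the highest information gain.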
Bad stuff?
No two respondents may directly contradict each other: identical answers to every question but a different outcome cannot be separated by any split.
The odds of any one such contradiction aren't too bad, so the larger the data set, the more likely it is to crash the program.
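A minimal Python sketch (with hypothetical survey rows) of how such a contradiction can be detected before building the tree, since no split can ever separate respondents who agree on every attribute:

```python
from collections import defaultdict

def contradictions(examples, target):
    """Find groups of respondents whose answers are identical on every
    attribute but who disagree on the target -- the case that breaks
    a naive tree builder."""
    groups = defaultdict(set)
    for e in examples:
        # Canonical key: all attribute/value pairs except the target.
        key = tuple(sorted((k, v) for k, v in e.items() if k != target))
        groups[key].add(e[target])
    return [key for key, labels in groups.items() if len(labels) > 1]

# Two respondents, identical attributes, opposite answers (hypothetical rows).
rows = [
    {"SEX": "MALE", "TIMELAB": "LOW", "YES": "Y"},
    {"SEX": "MALE", "TIMELAB": "LOW", "YES": "N"},
]
clashes = contradictions(rows, "YES")  # one conflicting group
```

Screening the CSV this way, or falling back to a majority vote at such leaves, would avoid the crash.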
Even with 240 data points, the tree is still very specific, and it is easy to see where individuals stand out.
More valuable for broad “decision factors”
ROCK ON