

Introducing Psychometric AI v2 4.6.02 -- not for circulation

Selmer Bringsjord & Bettina Schimanski & …?

Department of Cognitive Science

Department of Computer Science

RPI

Troy NY 12180

Selmer would like to express his deep gratitude to ETS, b/c w/o its support of the eWriter Project, the new form of AI proposed/described herein might well never have occurred to him.

Roots of this R&D…

Seeking to Impact a # of Fields

• This work weaves together relevant parts of:

– Artificial Intelligence: Build machine agents to “crack” and create tests.

– Psychology: Use experimental methods to uncover nature of human reasoning used to solve test items.

– Philosophy: Address fundamental “big” questions, e.g., What is intelligence? Would a machine able to excel on certain tests be brilliant?…

– Education: Discover the nature of tests used to make decisions about how students are taught what, when.

– Linguistics: Reduce reasoning in natural language to computation.

Many applications!

The Primacy of Psychology of Reasoning

There is consensus among the relevant luminaries in AI and theorem proving and psychology of reasoning and cognitive modeling that: machine reasoning stands to the best of human reasoning as a rodent stands to the likes of Kurt Gödel. In the summer before Herb Simon died, in a presentation at CMU, he essentially acknowledged this fact -- and set out to change the situation by building a machine reasoner with the power of first-rate human reasoners (e.g., professional logicians). Unfortunately, Simon passed away. Now, the only way to fight toward his dream (which of course many others before him expressed) is to affirm the primacy of psychology of reasoning. Otherwise we will end up building systems that are anemic. The fact is that first-rate human reasoners use techniques that haven't found their way into machine systems. E.g., humans use extremely complicated, temporally extended mental images and associated emotions to reason. No machine, no theorem prover, no cognitive architecture, uses such a thing.

The situation is different than chess -- radically so. In chess, we knew that brute force could eventually beat humans. In reasoning, brute force shows no signs of exceeding human reasoning. Therefore, unlike the case of chess, in reasoning we are going to have to stay with the attempt to understand and replicate in machine terms what the best human reasoners do. We submit that a machine able to prove that the key in an LR/RC problem is the key, and that the other options are incorrect, is an excellent point to aim for, perhaps the best that there is. As a starting place, we can turn to simpler tests.

“Chess is Too Easy”

Multi-Agent Reasoning, modeled in Mental Metalogic, is the key to reaching Simon’s Dream!

Pilot experiment shows that groups of reasoners instantly surmount the errors known to plague individual reasoners!

Come Wed 2.27.02 12n SA3205

What is Psychometric AI?

A New Kind of AI

• Assume the ‘A’ part isn’t the problem: we know what an artifact is.

• Psychometric AI offers a simple but radical answer:

– Psychometric AI is the field devoted to building information-processing entities (some of which will be robots) capable of at least solid performance on all established, validated tests of intelligence and mental ability, a class of tests that includes IQ tests, tests of reasoning, of creativity, mechanical ability, and so on.

• Don’t confuse this with: “Some human is intelligent…”

• Psychologists don’t agree on what human intelligence is.

– Two notorious conferences. See The g Factor.

• But we can agree that one great success story of psychology is testing, and prediction on the basis of it. (The Big Test)

AI is the field devoted to building intelligent artificial agents, i.e., agents capable of solid performance on intelligence tests.

An Answer to: What is AI?

Therefore…

Some of the tests…

Intelligence Tests: Narrow vs. Broad

Spearman’s view of intelligence

Thurstone’s view of intelligence

Let’s look @ RPM (AI-based replication of Carpenter et al.)

(Sample 1)

RPM Sample 2

RPM Sample 3

Artificial Agent to Crack RPM

---------------- PROOF ----------------
1 [] a33!=a31.
3 [] -R3(x)| -T(x)|x=y| -R3(y)| -T(y).
16 [] R3(a31).
24 [] T(a31).
30 [] R3(a33).
31 [] T(a33).
122 [hyper,31,3,16,24,30,flip.1] a33=a31.
124 [binary,122.1,1.1] $F.
------------ end of proof -------------

----------- times (seconds) -----------
user CPU time 0.62 (0 hr, 0 min, 0 sec)
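The elimination step behind these refutations can be mirrored in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' system: it reads clauses 3 and 7 of the RPM proofs as "at most one cell in row 3 carries T" and "at most one cell in row 3 carries StripedBar", takes the facts about cell a31 from the proofs, and invents three answer options for cell a33 purely for illustration. The names `contradicts_row`, `KNOWN_ROW3`, `UNIQUE_PER_ROW`, and `OPTIONS` are hypothetical.

```python
# Facts taken from the two proofs: cell a31 in row 3 carries T and StripedBar.
KNOWN_ROW3 = {"a31": {"T", "StripedBar"}}

# Features that clauses 3 and 7 allow at most once per row (assumed reading).
UNIQUE_PER_ROW = {"T", "StripedBar"}


def contradicts_row(candidate, known_row=KNOWN_ROW3, unique=UNIQUE_PER_ROW):
    """Return True if filling a33 with `candidate` repeats a once-per-row feature."""
    for features in known_row.values():
        # A shared feature that must be unique in the row forces a33 = a31,
        # which clashes with the fact a33 != a31 (clause 1).
        if candidate & features & unique:
            return True
    return False


# Hypothetical answer options for cell a33 (not taken from the slides).
OPTIONS = {
    "with_T": {"T"},
    "with_striped_bar": {"StripedBar"},
    "with_empty_bar": {"EmptyBar"},
}

surviving = [name for name, feats in OPTIONS.items() if not contradicts_row(feats)]
print(surviving)
```

Each option that the prover refutes corresponds to a `True` result here; an option for which no contradiction is found is the one left standing.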

Artificial Agent to Crack RPM

---------------- PROOF ----------------
1 [] a33!=a31.
7 [] -R3(x)| -StripedBar(x)|x=y| -R3(y)| -StripedBar(y).
16 [] R3(a31).
25 [] StripedBar(a31).
30 [] R3(a33).
32 [] StripedBar(a33).
128 [hyper,32,7,16,25,30,flip.1] a33=a31.
130 [binary,128.1,1.1] $F.
------------ end of proof -------------

----------- times (seconds) -----------
user CPU time 0.17 (0 hr, 0 min, 0 sec)

Artificial Agent to Crack RPM

=========== start of search ===========
given clause #1: (wt=2) 10 [] R1(a11).
given clause #2: (wt=2) 11 [] R1(a12).
given clause #3: (wt=2) 12 [] R1(a13).
...
given clause #4: (wt=2) 13 [] R2(a21).
given clause #278: (wt=16) 287 [para_into,64.3.1,3.3.1] R2(x)| -R3(a23)| -EmptyBar(y)| -R3(x)| -EmptyBar(x)| -T(a23)| -R3(y)| -T(y).
given clause #279: (wt=16) 288 [para_into,65.3.1,8.3.1] R2(x)| -R3(a23)| -StripedBar(y)| -R3(x)| -StripedBar(x)| -EmptyBar(a23)| -R3(y)| -EmptyBar(y).
Search stopped by max_seconds option.
============ end of search ============

Correct!

Possible Objection

“If one were offered a machine purported to be intelligent, what would be an appropriate method of evaluating this claim? The most obvious approach might be to give the machine an IQ test … However, [good performance on tasks seen in IQ tests would not] be completely satisfactory because the machine would have to be specially prepared for any specific task that it was asked to perform. The task could not be described to the machine in a normal conversation (verbal or written) if the specific nature of the task was not already programmed into the machine. Such considerations led many people to believe that the ability to communicate freely using some form of natural language is an essential attribute of an intelligent entity.” (Fischler & Firschein 1990, p. 12)

WAIS

A Broad Intelligence Test…

Cube Assembly

Basic Setup

Problem: Solution:

Harder Cube Assembly

Basic Setup

Problem: Solution:

The robot in Selmer’s lab that will be able to excel on the WAIS and other tests. We don’t yet have a name for our artificial master of tests. MIT has COG. What should the name be? Suggestions are welcome! Send to [email protected].

Picture Completion

Picture Completion

Currently untouchable by AI -- but we shall see.

And ETS’ tests…

The “Lobster”

Lobsters usually develop one smaller, cutter claw and one larger, crusher claw. To show that exercise determines which claw becomes the crusher, researchers placed young lobsters in tanks and repeatedly prompted them to grab a probe with one claw – in each case always the same, randomly selected claw. In most of the lobsters the grabbing claw became the crusher. But in a second, similar experiment, when lobsters were prompted to use both claws equally for grabbing, most matured with two cutter claws, even though each claw was exercised as much as the grabbing claws had been in the first experiment.

Which of the following is best supported by the information above?

A Young lobsters usually exercise one claw more than the other.
B Most lobsters raised in captivity will not develop a crusher claw.
C Exercise is not a determining factor in the development of crusher claws in lobsters.
D Cutter claws are more effective for grabbing than are crusher claws.
E Young lobsters that do not exercise either claw will nevertheless usually develop one crusher and one cutter claw.

Same Approach Used

---------------- PROOF ----------------
1 [] -Lobster(x)|Cutter(r(x)).
3 [] -Lobster(x)| -Exercise(r(x))| -Exercise(l(x))|Cutter(l(x)).
4 [] -Lobster(x)| -Cutter(r(x))| -Cutter(l(x)).
5 [] Lobster($c1).
6 [] Exercise(r($c1)).
7 [] Exercise(l($c1)).
9 [hyper,5,1] Cutter(r($c1)).
10 [hyper,7,3,5,6] Cutter(l($c1)).
11 [hyper,10,4,5,9] $F.
------------ end of proof -------------

----------- times (seconds) -----------
user CPU time 0.38 (0 hr, 0 min, 0 sec)
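The contradiction in the lobster proof can be double-checked without a theorem prover. The sketch below is an illustration under stated assumptions, not the procedure used in the slides: it grounds clauses 1, 3, 4, 5, 6, and 7 on the single constant $c1 and then tests satisfiability by brute-force truth table, which is a different technique from the hyperresolution steps shown in the proof. The atom names (`CutterR` for Cutter(r($c1)), and so on) and the function `satisfiable` are hypothetical.

```python
from itertools import product

# Ground atoms for the single lobster $c1 (names are illustrative).
ATOMS = ["Lobster", "ExerciseR", "ExerciseL", "CutterR", "CutterL"]

# Each clause is a list of (atom, polarity) literals; polarity False means negated.
CLAUSES = [
    [("Lobster", False), ("CutterR", True)],                       # clause 1
    [("Lobster", False), ("ExerciseR", False),
     ("ExerciseL", False), ("CutterL", True)],                     # clause 3
    [("Lobster", False), ("CutterR", False), ("CutterL", False)],  # clause 4
    [("Lobster", True)],                                           # clause 5
    [("ExerciseR", True)],                                         # clause 6
    [("ExerciseL", True)],                                         # clause 7
]


def satisfiable(clauses, atoms):
    """Try every truth assignment; return True if one satisfies all clauses."""
    for values in product([False, True], repeat=len(atoms)):
        model = dict(zip(atoms, values))
        if all(any(model[atom] == polarity for atom, polarity in clause)
               for clause in clauses):
            return True
    return False


# The full clause set has no model, matching the $F derived in the proof.
print(satisfiable(CLAUSES, ATOMS))

# Dropping clause 4 restores satisfiability, so clause 4 is needed for the clash.
print(satisfiable(CLAUSES[:2] + CLAUSES[3:], ATOMS))
```

With only five ground atoms the table has 32 rows, so exhaustive checking is cheap here; it would not scale to the larger clause sets in the RPM search log.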

Therefore option A is correct!

Underlying Math

…explained by hand as it’s a bit intricate…

More Careful Look

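in : $\forall x\,\bigl(L(x) \rightarrow ((C(l(x)) \land R(r(x))) \lor (C(r(x)) \land R(l(x))))\bigr)$

in : $\forall x\,\bigl(L(x) \rightarrow ((C(l(x)) \land R(r(x)) \land L(r(x),l(x))) \lor (C(r(x)) \land R(l(x)) \land L(l(x),r(x))))\bigr)$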

Comments on RC Items…


“Wuthering Heights”…

Many critics of Emily Brontë’s novel Wuthering Heights see its second part as a counterpoint that comments on, if it does not reverse, the first part, where a “romantic” reading receives more confirmation. Seeing the two parts as a whole is encouraged by the novel’s sophisticated structure, revealed in its complex use of narrators and time shifts. Granted that the presence of these elements need not argue an authorial awareness of novelistic construction comparable to that of Henry James, their presence does encourage attempts to unify the novel’s heterogeneous parts. However, any interpretation that seeks to unify all of the novel’s diverse elements is bound to be somewhat unconvincing. This is not because such an interpretation necessarily stiffens into a thesis (although rigidity in an interpretation of this or of any novel is always a danger), but because Wuthering Heights has recalcitrant elements of undeniable power that, ultimately, resist inclusion in an all-encompassing interpretation. In this respect, Wuthering Heights shares a feature of Hamlet.

According to the passage, which of the following is a true statement about the first and second parts of Wuthering Heights?

(A) The second part has received more attention from critics....

Additional Objections…

PAI is too idiosyncratic!

• Actually, PAI can be viewed as a generalization of the Turing Test-based answer to “What is AI?”

– AI is the field devoted to building artificial agents capable of passing the Turing Test. (As affirmed in a number of texts.)

• PAI has the major advantage of requiring high performance on many different tests, all, unlike TT, grounded in psychology.

But your applications are only in Testing!

• No. An agent able to perform well on all these tests can do everything and then some.

• If we believe that psychology has, through tests, isolated, in gem-like fashion, what’s most important in cognition, then powerful agents in PAI will be powerful agents, period.

Psychometric AI in Context …

A Classic “Cognitive System” Setup Under Development

[Figure: a Test Item reaches the Cognitive System as its “percept”; the system’s “action” is the choice of the correct option, the ruling out of the others, and… actions that involve physical manipulation of objects and locomotion.]

Fits forthcoming Superminds book by Bringsjord & Zenzen…

• “Weak” AI based on testing going back to Turing is implied for the practice of AI.

Fits “Complete” CogSci…

[Figure, repeated across four slides with one layer added each time: a Cognitive System coupled to its Environment through Perception and Action.]

Perception and Action: low-level and high-level; subdeclarative computation.

Cognitive Modeling: Short Term Memory, Long Term Memory, ACT-R.

Reasoning (Bringsjord): Mental Metalogic, Syntactic Reasoning, Semantic Reasoning.

Cognitive Human Factors: Engineering the Interface b/t Cognitive Systems and their Environments.

Large Variation in Difficulty

Evans’ ANALOGY Program