Effect of Conditional Feedback on Learning

Navid Ghaffarzadegan
PhD Student, the State University of New York at Albany

SUNY at Albany
MIT-Albany-WPI System Dynamics Colloquium, Spring 2008

Introduction

Barriers to learning from feedback in a dynamic decision-making environment:
- Complexity of the environment (Gonzalez 2005)
- Misperception of delays (Rahmandad et al. 2007, Rahmandad 2008)
- Feedback asymmetry (Denrell and March 2001)
- The existence of noise in feedback (Bereby-Meyer and Roth 2006)
- Problems of mental models (Senge 1996)
- …

People ignore and misperceive feedback (Sterman 1989a, Sterman 1989b).



Introduction

A common theme in formal studies on learning:
- A decision maker makes a decision and receives a payoff.
- The question is whether or not the decision maker is capable of learning from the information.

Full Feedback

[Diagram: learning loop linking Decision, Payoff, and Perceiving payoff]



Introduction

Little attention has been paid to the relevance of the full feedback assumption.
e.g. police officer, admission office, human resources manager

Conditional Feedback
For positive decisions, we perceive feedback much more easily than for negative decisions.

[Diagram: learning loop linking Decision, Payoff, and Perceiving payoff]



Research Problem

What is the effect of conditional feedback on learning?
Or: how relevant is the assumption of "Full Feedback"?

Method:
1. Simulation
   - Build a differential equation model in the signal detection framework
   - Experiment with the model
2. Test with data
   - Second-hand data: a published laboratory experiment



Framework

Signal Detection Theory
- Signal vs. noise, e.g. guilty vs. innocent, capable vs. incapable candidates
- Decision makers try to differentiate signals from noise

Judgment and Decision Making
- Evidence is often ambiguous, and there is uncertainty in the environment (Hammond 1996, Stewart 2000)



Framework

[Figure: probability density over the judgment axis for the noise distribution (e.g. innocent persons) and the signal distribution (e.g. guilty persons), separated by d', with a decision threshold]

Important concepts: base rate, selection rate, d', threshold

Payoff, Threshold Learning
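The four concepts above can be tied together numerically. The following is a minimal sketch, assuming equal-variance normal distributions (noise ~ N(0,1), signal ~ N(d',1), as on the deck's Inputs slide) and assuming a case is selected when its evidence exceeds the threshold. The function names are illustrative and not taken from the talk.

```python
from math import erf, sqrt


def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def sdt_rates(threshold, d_prime, base_rate):
    """Return (hit rate, false alarm rate, selection rate) for a given threshold.

    Assumes noise ~ N(0, 1), signal ~ N(d_prime, 1), and that a case is
    selected (positive decision) when its evidence exceeds the threshold.
    """
    hit_rate = 1.0 - normal_cdf(threshold, mean=d_prime)
    false_alarm_rate = 1.0 - normal_cdf(threshold, mean=0.0)
    # Selection rate mixes both groups according to the base rate of signals.
    selection_rate = base_rate * hit_rate + (1.0 - base_rate) * false_alarm_rate
    return hit_rate, false_alarm_rate, selection_rate


hit, false_alarm, selection = sdt_rates(threshold=0.5, d_prime=1.0, base_rate=0.5)
print(f"hit={hit:.3f} false_alarm={false_alarm:.3f} selection={selection:.3f}")
```

With a threshold halfway between the two means and a base rate of 0.5, the selection rate equals the base rate by symmetry, which is why the selection rate can serve as a visible trace of where the threshold sits.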



Framework

[Figure: probability density over the judgment axis for the noise distribution (e.g. innocent persons) and the signal distribution (e.g. guilty persons), separated by d', with a decision threshold]

Conditional Feedback, Threshold Learning (Cue Learning)



Model I: Full Feedback

[Figure: signal and noise probability densities over the judgment axis, with successive thresholds T1, T2, T3]

[Diagram: Threshold Learning loop linking Threshold, payoff, and payoff perception]

Set threshold → make decision → receive payoff → perceive payoff → correct threshold

One stock: Threshold (experiment)



[Figure: signal and noise probability densities with thresholds T1, T2, T3; Threshold Learning loop linking Threshold, payoff, and payoff perception]

Learning Algorithm:
1. Learning from payoff shortfall:
   Payoff shortfall = maximum possible payoff (Q) − payoff (Q, d)
   Maximum possible payoff (Q) = Vtn + Q*(Vtp − Vtn)
2. Anchoring and adjustment assumption in correcting the threshold (Tversky and Kahneman 1974, Epley and Gilovich 2001, Sterman 1989b)
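The two steps of the learning algorithm can be sketched as a trial-by-trial simulation. This is only an illustration under stated assumptions, not the author's differential equation model. It reads Q as the true state (1 for signal, 0 for noise) and d as the decision (1 for positive, 0 for negative); the payoff numbers, the names `V_FP` and `V_FN`, the `adjust_time` parameter, and the rule that a shortfall after a positive decision raises the threshold while a shortfall after a negative decision lowers it are all hypothetical choices made for this sketch.

```python
import random

# Vtp and Vtn are named on the slide; the numbers and V_FP / V_FN are assumed.
V_TP, V_TN = 1.0, 1.0    # values of correct positive / negative decisions
V_FP, V_FN = -1.0, -1.0  # hypothetical values of wrong positive / negative decisions


def payoff(q, d):
    """Payoff for true state q (1 = signal, 0 = noise) and decision d (1 = positive)."""
    if d == 1:
        return q * V_TP + (1 - q) * V_FP
    return q * V_FN + (1 - q) * V_TN


def max_payoff(q):
    """Maximum possible payoff as on the slide: Vtn + Q*(Vtp - Vtn)."""
    return V_TN + q * (V_TP - V_TN)


def shortfall(q, d):
    """Payoff shortfall = maximum possible payoff (Q) - payoff (Q, d)."""
    return max_payoff(q) - payoff(q, d)


def simulate_full_feedback(d_prime=1.0, base_rate=0.5, trials=2000,
                           adjust_time=20.0, start=3.0, seed=0):
    """Return the threshold after each trial when every payoff is observed."""
    rng = random.Random(seed)
    threshold = start
    history = []
    for _ in range(trials):
        q = 1 if rng.random() < base_rate else 0
        evidence = rng.gauss(d_prime if q == 1 else 0.0, 1.0)
        d = 1 if evidence > threshold else 0
        # Assumed anchoring-and-adjustment step: keep the old threshold as the
        # anchor and move it by a fraction of the shortfall. A wrong positive
        # decision raises the threshold, a wrong negative decision lowers it.
        direction = 1.0 if d == 1 else -1.0
        threshold += direction * shortfall(q, d) / adjust_time
        history.append(threshold)
    return history


history = simulate_full_feedback()
late_mean = sum(history[-1000:]) / 1000
print(f"mean threshold over the last 1000 trials: {late_mean:.2f}")
```

Under these assumptions, with symmetric values and a base rate of 0.5, the expected adjustment is zero at a threshold of d'/2, so a run started far from that point should drift toward it and then fluctuate around it. A larger `adjust_time` makes the approach slower and the fluctuations smaller.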



[Figure: signal and noise probability densities with thresholds T1, T2, T3; Threshold Learning loop linking Threshold, payoff, and payoff perception]

Inputs: noise ~ N(0, 1) and signal ~ N(d', 1)

Base rate = 0.5; values are symmetric (to make the base rate traceable); correct decisions are more valued
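For reference, the standard signal detection result for the payoff-maximizing threshold under these inputs can be written out. This is a textbook derivation offered as a sketch, not a formula taken from the slides. Here b denotes the base rate, and V_fp and V_fn are hypothetical names for the values of wrong positive and wrong negative decisions, which the deck does not name.

```latex
% b = base rate of signals, x^* = payoff-maximizing threshold on the judgment axis.
% V_{fp} and V_{fn} are hypothetical names for the values of wrong decisions.
% Noise ~ N(0,1) and signal ~ N(d',1), so the likelihood ratio at x is
% exp(d' x - (d')^2 / 2). Setting the derivative of expected payoff to zero gives:
\[
  \exp\!\left(d'\,x^{*} - \tfrac{1}{2}(d')^{2}\right)
  = \frac{1-b}{b}\cdot\frac{V_{tn}-V_{fp}}{V_{tp}-V_{fn}}
\]
\[
  x^{*} = \frac{d'}{2}
        + \frac{1}{d'}\,
          \ln\!\left(\frac{1-b}{b}\cdot\frac{V_{tn}-V_{fp}}{V_{tp}-V_{fn}}\right)
\]
% With b = 0.5 and symmetric values (V_{tn}-V_{fp} = V_{tp}-V_{fn}),
% the logarithm vanishes and x^* = d'/2.
```

In the symmetric case with a base rate of 0.5, the optimal threshold therefore sits midway between the two distribution means, and the resulting selection rate equals the base rate.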



Model I: Full Feedback - Results

[Figure: threshold over time (0 to 1000 minutes), compared with the optimal threshold]

[Figure: average selection rate in the last 50 trials over time (minutes), compared with the average base rate, for base rate = 0.5]

In full feedback, the model is able to learn from feedback.



In full feedback, the model is able to learn from feedback.
- Looking at the payoff shortfall is enough to learn the threshold.
- The speed of approach depends on the time to change the threshold.

[Figure: dynamics of the average selection rate in the last 50 trials over time (minutes), for base rates of 0.3 and 0.7]



Model II: Conditional Feedback

Our decisions influence our payoff perception.

How do we judge the payoff of our negative decisions?

People can differ in how they interpret their negative decisions (personality, second-loop learning, ...).

[Diagram: Threshold Learning loop linking Threshold, payoff, payoff perception, and feedback availability]

[Figure: signal and noise probability densities over the judgment axis, with regions of clear feedback and unclear (no) feedback on either side of the threshold]



Model II: Conditional Feedback

Constructivist strategy
For negative decisions: perceived payoff = payoff (p, 0)

p = ratio of signals to total decisions for negative decisions
p = 0 means assuming all of our negative decisions are right.
p = 1 means assuming all of our negative decisions are wrong.
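The role of p can be illustrated with a small trial-by-trial simulation. This is a sketch under assumptions, not the author's model: the payoff numbers, the names `V_FP` and `V_FN`, the `adjust_time` parameter, and the direction of the threshold adjustment are hypothetical. Positive decisions reveal the true outcome; negative decisions are coded with the assumed share p, so their perceived shortfall is maximum possible payoff (p) minus payoff (p, 0).

```python
import random

# Vtp and Vtn are named in the deck; the numbers and V_FP / V_FN are assumed.
V_TP, V_TN = 1.0, 1.0
V_FP, V_FN = -1.0, -1.0


def payoff(q, d):
    """Payoff for signal share q (0..1) and decision d (1 = positive, 0 = negative)."""
    if d == 1:
        return q * V_TP + (1 - q) * V_FP
    return q * V_FN + (1 - q) * V_TN


def shortfall(q, d):
    """Maximum possible payoff Vtn + q*(Vtp - Vtn), minus the payoff received."""
    return (V_TN + q * (V_TP - V_TN)) - payoff(q, d)


def simulate_conditional_feedback(p, d_prime=1.0, base_rate=0.5, trials=1000,
                                  adjust_time=20.0, start=0.5, seed=0):
    """Return (mean threshold, selection rate) over the second half of the run."""
    rng = random.Random(seed)
    threshold = start
    thresholds, decisions = [], []
    for _ in range(trials):
        q = 1 if rng.random() < base_rate else 0
        evidence = rng.gauss(d_prime if q == 1 else 0.0, 1.0)
        d = 1 if evidence > threshold else 0
        if d == 1:
            # Clear feedback: the true outcome of a positive decision is seen.
            threshold += shortfall(q, 1) / adjust_time
        else:
            # No feedback: the outcome is coded as payoff(p, 0) instead.
            threshold -= shortfall(p, 0) / adjust_time
        thresholds.append(threshold)
        decisions.append(d)
    half = trials // 2
    count = trials - half
    return sum(thresholds[half:]) / count, sum(decisions[half:]) / count


confident = simulate_conditional_feedback(p=0.0)  # all negatives coded as right
doubtful = simulate_conditional_feedback(p=1.0)   # all negatives coded as wrong
print(f"p=0: threshold={confident[0]:.2f} selection rate={confident[1]:.2f}")
print(f"p=1: threshold={doubtful[0]:.2f} selection rate={doubtful[1]:.2f}")
```

With p = 0 the sketch never perceives a shortfall after a negative decision, so the threshold can only rise and the selection rate falls below the base rate. With p = 1 every negative decision counts as a mistake, which pushes the threshold down and the selection rate up.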

[Diagram: Threshold Learning loop linking Threshold, payoff, and payoff perception, with P entering payoff perception]

[Figure: signal and noise probability densities over the judgment axis, with regions of clear feedback and unclear (no) feedback on either side of the threshold]



Model II: Conditional Feedback - Results

In conditional feedback, learning depends on how people code their negative decisions.
- Elwin et al. (2007): p = 0
- Stewart et al. (2007): people underestimate the selection rate and overestimate the threshold

[Figure: average selection rate over time (0 to 1000 minutes), compared with the base rate, showing the final selection rate interval between P = 0 and P = 1]

[Figure: threshold over time (0 to 1000 minutes), compared with the optimal threshold, showing the final threshold interval between P = 0 and P = 1]



Comparison of full feedback and conditional feedback in the confident constructivist strategy

[Figure: dynamics of (a) the selection rate in the last 50 trials and (b) the threshold, for a base rate of 0.5, under full feedback and conditional feedback, with the optimal threshold shown for reference]




In conditional feedback with the confident constructivist strategy, the model is not able to learn from feedback.

What is the real p? How do people really code their negative decisions?

[Figure: dynamics of the selection rate for base rates of 0.3 and 0.7, compared with full feedback]




Replications of an empirical investigation

- Data from Elwin et al. (2007)
- Comparison of full feedback and conditional feedback
- Sixty-four subjects performed a computerized task of predicting economic outcomes for companies
- The experiment had two major phases: training trials and a test phase
- In the training part, one group of subjects performed 120 trials of full feedback decision making, while the other group performed 240 trials of conditional feedback



Replications of an empirical investigation

We feed their published results into our model and search for parameter values that can replicate their findings.

Main parameters: d' and p (and time to adjust threshold)
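A parameter search of this kind can be sketched as a grid sweep over d' and p. The version below is a hypothetical mean-field illustration, not the procedure used in the talk: it assumes symmetric values, a base rate of 0.5, and an adjustment rule in which false positives raise the threshold while negative decisions coded as wrong (with share p) lower it. It finds the threshold where the expected adjustment is zero by bisection and reports which (d', 1 − p) pairs give a selection rate SR inside a target band; the band 0.30 to 0.36 is the replication area labelled in the deck's figure.

```python
from math import erf, sqrt


def phi_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def selection_rate(threshold, d_prime, base_rate):
    """Share of positive decisions for noise ~ N(0,1), signal ~ N(d',1)."""
    hit = 1.0 - phi_cdf(threshold - d_prime)
    false_alarm = 1.0 - phi_cdf(threshold)
    return base_rate * hit + (1.0 - base_rate) * false_alarm


def expected_drift(threshold, d_prime, p, base_rate):
    """Expected threshold change per trial, up to a positive scale factor.

    Assumed rule: false positives push the threshold up; negative decisions
    push it down in proportion to p, the assumed share of missed signals.
    """
    false_positive_mass = (1.0 - base_rate) * (1.0 - phi_cdf(threshold))
    negative_mass = 1.0 - selection_rate(threshold, d_prime, base_rate)
    return false_positive_mass - p * negative_mass


def equilibrium_selection_rate(d_prime, p, base_rate=0.5):
    """Selection rate at the threshold where the expected drift is zero."""
    low, high = -10.0, 10.0
    for _ in range(100):
        mid = (low + high) / 2.0
        # The drift decreases with the threshold, so a positive drift means
        # the zero-drift point lies above mid.
        if expected_drift(mid, d_prime, p, base_rate) > 0.0:
            low = mid
        else:
            high = mid
    return selection_rate((low + high) / 2.0, d_prime, base_rate)


def sweep(d_primes, confidences, sr_low=0.30, sr_high=0.36):
    """Return (d', confidence, SR) triples whose SR falls inside the band."""
    hits = []
    for d_prime in d_primes:
        for confidence in confidences:
            sr = equilibrium_selection_rate(d_prime, p=1.0 - confidence)
            if sr_low <= sr <= sr_high:
                hits.append((d_prime, confidence, sr))
    return hits


hits = sweep(d_primes=[0.5, 1.0, 1.5], confidences=[0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
for d_prime, confidence, sr in hits:
    print(f"d'={d_prime} confidence={confidence} SR={sr:.3f}")
```

In this sketch, higher confidence in negative decisions (smaller p) always yields a lower equilibrium selection rate, and the selection rate matches the base rate only when p equals the actual share of signals among the rejected cases.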

[Figure: regions of the final selection rate (SR) over two axes, level of expertise (d') and confidence in negative decisions (1 − p), each running from low to high. Regions: SR < 0.3; replication area 0.3 ≤ SR ≤ 0.36; 0.36 < SR < 0.5; SR = 0.5; 0.5 < SR. An overconfidence region is marked.]



Conclusion

- A new explanation for imperfect decision making in a series of tasks (learning from clear shortfalls).
- Conditional feedback can result in bias and underestimation of the base rate.
- With respect to second-loop learning: people do not find the optimal threshold. Even if second-loop learning exists in the real world, it works for a limited number of people.
- A warning about overestimating the relevance of the full feedback assumption in formal studies.



Conclusion

Future works: Effects of personality traits

Effect of personality on learning, e.g. using the Big Five

[Diagram: Threshold Learning loop linking Threshold, payoff, payoff perception, and P]



Conclusion

Future works: Making confidence endogenous

Effect of personality on learning, e.g. using the Big Five

Dynamics of confidence building

[Diagram: Threshold Learning loop linking Threshold, payoff, payoff perception, and P, extended with Confidence]



Conclusion

Future works: Two individuals communicating

[Diagram: two coupled Threshold Learning loops, one with Threshold, payoff, payoff perception, and P, the other with Threshold 1, payoff 1, feedback availability 1, payoff perception 1, and P1]



Conclusion

Future works: Two individuals influencing each other's performance

[Diagram: two coupled Threshold Learning loops, one with Threshold, payoff, payoff perception, and P, the other with Threshold 1, payoff 1, feedback availability 1, payoff perception 1, and P1]




Thanks

FEEDBACK?
