atr presentation

Pain Avoidance Learning “Model-based” and “Model-free” systems

Oliver Wang
Department of Cognitive Neuroscience

Project Concepts

Model-Based System

● Creation of a cognitive map of the space
● Learning the values of actions as well as states

Model-Free System

● Only learning the values of each action
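The contrast between the two systems can be illustrated with a minimal sketch. The function names, the dictionary layout, and the convention of coding outcomes as negative shock counts are assumptions made for illustration; they are not taken from the project's code.

```python
def model_free_update(q, state, action, outcome, alpha=0.1):
    """Model-free: nudge the cached value of (state, action) toward the outcome.

    No map of the task is used; only the value of each action is learned.
    """
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (outcome - old)
    return q


def model_based_value(transition_probs, state_values):
    """Model-based: evaluate an action by looking ahead through a cognitive map.

    transition_probs maps each possible next state to its estimated probability;
    state_values maps each state to its estimated value.
    """
    return sum(p * state_values[s] for s, p in transition_probs.items())


# Outcomes are coded as negative shock counts (an assumption of this sketch).
q_table = model_free_update({}, "start", "L", outcome=-4.0, alpha=0.5)
lookahead = model_based_value({"A": 0.9, "B": 0.1}, {"A": -1.0, "B": -3.0})
```

The model-free learner only changes its estimate after experiencing an outcome, whereas the model-based value changes as soon as the map (transition probabilities or state values) changes.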

Project Basis

● Neural Computations Underlying Arbitration between Model-based and Model-free Learning, by Sang Wan Lee, Shinsuke Shimojo, and John P. O’Doherty (2014)

● We hypothesized that a similar arbitration method might exist in aversion learning as it does in reward learning.

Literature

Minimal research has been done on pain aversion learning.

● Hendersen and Graham (Avoidance of Heat by Rats, 1979)
● Prevost and O’Doherty (Pavlovian Aversive Learning, 2013)
● Gillan and Robbins (Enhanced Avoidance Habits, 2014)

Task Design: Two-layer Markov Decision Task

● Training session followed by 2 sessions, each with 48 blocks of 5 trials on average
● Sequential two-alternative choices (L/R) leading to a final state
● Each following state is determined by the choice and the branch probabilities in effect at that time
● 4 block conditions:

o Flexible, high uncertainty
o Flexible, low uncertainty
o Specific, high uncertainty
o Specific, low uncertainty

Block Conditions

Flexible
Final state values are set, and you receive the number of shocks indicated.
Encourages a Model-Free strategy

Specific
Bin color must match the final state color to receive the number of shocks indicated. Otherwise you receive 4 shocks, the maximum number.
Encourages a Model-Based strategy
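The outcome rule for the two block conditions can be sketched as follows. The function name, argument names, and string labels are hypothetical; only the rule itself (indicated shocks in the flexible condition, and 4 shocks on a color mismatch in the specific condition) comes from the slide.

```python
MAX_SHOCKS = 4  # maximum number of shocks, as stated on the slide


def shocks_delivered(condition, state_shocks, state_color=None, bin_color=None):
    """Return the number of shocks delivered at the final state.

    condition: "flexible" or "specific" (labels assumed for this sketch).
    state_shocks: number of shocks indicated at the final state.
    """
    if condition == "flexible":
        # The indicated number of shocks is always delivered.
        return state_shocks
    if condition == "specific":
        # The indicated number applies only if the bin color matches.
        return state_shocks if bin_color == state_color else MAX_SHOCKS
    raise ValueError(f"unknown condition: {condition!r}")
```

For example, `shocks_delivered("specific", 1, state_color="red", bin_color="blue")` returns 4, because the colors do not match.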

Example of 1 trial, flexible condition, high uncertainty

Probability: Uncertainty

High uncertainty vs. Low uncertainty

1. High uncertainty refers to a (.5,.5) chance between the 2 resulting states.

2. Low uncertainty refers to a (.9, .1) chance, so one state (the left state in the diagram to the right) is strongly favored.

*Uncertainty is maintained throughout each block
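A minimal sketch of how a following state could be drawn under each uncertainty level is given below. The names `BRANCH_PROBS` and `sample_next_state` are hypothetical, and the sketch assumes the first candidate state is the favored one in the low uncertainty case.

```python
import random

# Branch probabilities from the slide: (.5, .5) for high, (.9, .1) for low.
BRANCH_PROBS = {"high": (0.5, 0.5), "low": (0.9, 0.1)}


def sample_next_state(candidates, uncertainty, rng):
    """Draw one of two candidate states using the block's branch probabilities."""
    weights = BRANCH_PROBS[uncertainty]
    return rng.choices(candidates, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so that a run is reproducible
# The uncertainty level is fixed for the whole block, so every trial reuses it.
block = [sample_next_state(("left", "right"), "low", rng) for _ in range(5)]
```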

Behavioral Results

Participants: 16 subjects

Behavioral Results, cont’d

Observation
● Significantly higher proportion observed in the flexible, low uncertainty condition

Conclusion
● Some difference in the arbitrator must exist

Model-free Simulation

16 simulated subjects; Alpha = .03, Beta = 1

MF Learning is able to replicate only the results of the flexible condition
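A model-free learner with these parameters can be sketched as below, assuming Alpha is a learning rate and Beta a softmax inverse temperature (the usual reading, though the slide does not define them). The sketch uses a simplified one-step stand-in for the task in which each action yields a fixed number of shocks; it is not the simulation used in the project, and all function names are hypothetical.

```python
import math
import random

ALPHA = 0.03  # learning rate (value from the slide)
BETA = 1.0    # softmax inverse temperature (value from the slide)


def softmax_probs(values, beta=BETA):
    """Convert action values into choice probabilities."""
    top = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (v - top)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def simulate_subject(shocks_per_action, n_trials, rng, alpha=ALPHA, beta=BETA):
    """Run one simulated subject; outcomes are coded as negative shock counts."""
    q = [0.0] * len(shocks_per_action)
    for _ in range(n_trials):
        probs = softmax_probs(q, beta)
        action = rng.choices(range(len(q)), weights=probs, k=1)[0]
        outcome = -shocks_per_action[action]
        q[action] += alpha * (outcome - q[action])  # delta-rule update
    return q


# 16 simulated subjects, each with its own seed.
subjects = [simulate_subject([1, 4], 500, random.Random(seed)) for seed in range(16)]
```

With a learning rate this small, values change slowly, so the simulated subject needs many trials before its choices clearly favor the action with fewer shocks.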

Subject Choices: MB/MF

Left
● Refers to the flexible condition
● MB system is not adequate

Right
● Refers to the specific condition
● MB system is adequate

Parameters

Observation
● Parameter for “learning rate for the estimate of absolute reward prediction error” is much greater in our pain aversion task

Interpretation
● Suggests a more dynamic arbitration system exists

*Pain Aversion Parameters (left bar) and Reward Based Parameters (right bar) (Lee)
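To illustrate why a larger value of this parameter would make arbitration more dynamic, the sketch below assumes the estimate of the absolute reward prediction error is tracked with a simple delta rule. This form, the function name, and the example learning rates are assumptions for illustration; the slide does not give the exact update used.

```python
def update_abs_rpe_estimate(estimate, rpe, eta):
    """Move the running estimate of |RPE| toward the latest absolute error.

    eta is the learning rate for the estimate of absolute reward prediction error.
    """
    return estimate + eta * (abs(rpe) - estimate)


# The same surprising outcome shifts the estimate more when eta is larger
# (0.1 and 0.5 are arbitrary illustrative values, not fitted parameters).
slow = update_abs_rpe_estimate(0.0, rpe=-2.0, eta=0.1)
fast = update_abs_rpe_estimate(0.0, rpe=-2.0, eta=0.5)
```

Under this assumption, a larger learning rate means the estimated reliability of the model-free system reacts more quickly to recent prediction errors.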

Conclusion

1. Both Model-free and Model-based systems exist in aversion learning.

2. Although the arbitration process between the two systems shares many similarities with that of reward-based learning, there are subtle differences between the two.

Next Steps

1. fMRI
2. Modeling

Thank you for listening.
