evolutionary dynamics, game theory, and psychology

Post on 29-Mar-2015


Evolutionary Dynamics, Game Theory, and Psychology

Today and next Tues:

We will show you some work we are doing at PED that uses

- evolutionary dynamics
- game theory
- experiments

to understand human psychology, e.g.

- Altruistic motives
- Sense of beauty
- Why we are “principled”

Will start off “philosophical” (in order to set the stage)

Then will present ongoing research…

This is work done at PED, with Erez, Oliver, Carl, Alex, Matthiajs, Martin …

Topic 1: Charitable Giving

We give A LOT:

- ~2% of GDP is donated to charity
- ~2-4% of hours worked are volunteered

But we don’t always give in the most effective way…

Habitat for Humanity: college-educated 19-year-olds who’ve never held a hammer fly halfway across the world

To build new homes in places where there is plenty of cheap, qualified labor!

Is this just because we don’t realize how ineffective Habitat for Humanity is?

Or do we not care how effective we are?

IF we ACTUALLY care about being effective…we should give more when our gifts are matched, no?

And even more so when our gifts are tripled, no?

Do we?

Some economists ran a study where they collected $ for charity…

And they manipulated the “matching rate”

What impact did the matching rate have?

None…

We ask…

- Why don’t we care about effectiveness?
- How can we make giving more effective?

More generally…

- Why are our altruistic preferences so funny?
- Can we characterize our altruistic preferences?
- And can we use this knowledge to increase giving? Or to make giving more impactful?

Topic 2: Beauty

A few facts about beauty…

Fact 1: varies by culture

e.g., let’s learn about the ideal body weight in some indigenous populations in Nigeria...

Clearly, ideal body weight isn’t the same there as here.

Why not?

Random cultural variation?

Likewise, ideal skin tone varies by culture…

In Eastern countries, un-tanned skin is considered more attractive …

And likewise, in the West in the old days…

But nowadays in the West we seem to prefer tanned skin…

Why did our preferences change? Why do they differ between East and West?

Random fluctuations?

Even our ideal finger nail length varies by culture…

Here’s what finger nails look like on some men among the Khasi in N.E. India

Fact 2: We like art that is “authentic” (even if looks the same!)

Let me tell you about a study that illustrated this…

Researchers showed subjects two paintings

Some subjects were told the second painting was purposely designed by a different artist, making the second painting a “forgery.”

Some subjects were told the same artist painted both, making the second painting a “replica.”

Within each group, half the subjects were told painting A was created first.

When the second painting was “created by the same artist,” it was rated higher.

[Figure: ratings in the “same artist” and “different artist” conditions]

Source: Newman and Bloom (2011)

Why do we like our art to be “original”?

Perhaps this has nothing to do with aesthetics…

Well…when the study was repeated with an artifact (e.g. a car) instead of paintings, there was no effect…

Source: Newman and Bloom (2011)

Topic 3: “Principles”

Why do we like people who are principled?

For instance, this statesman in The West Wing who returns a card that can save his life...out of principle

And we admire him for it…even though turning down the card helps no one

In contrast, people who are strategic and calculating,

like this cop from “The Wire” who prefers “better stats” to solving murders

…are repulsive. Even though his strategic actions don’t harm you…

Why do we like people who are principled/idealistic and dislike those who are strategic/calculated/Machiavellian, regardless of whether their actions help or harm us?

More generally…

Where do our preferences and ideologies come from?

In our research…

We try to understand where such preferences and ideologies come from.

Using evolutionary dynamics + game theory + experiments

Quick review: what is game theory?

Game theory models behavior in any social interaction

Social interaction=my payoffs depend on what I do as well as what others do.

Let me illustrate using a simple game…

The simplest “game” can be represented by the following “payoff matrix”:

        L        R
U     5, 6     8, 4
D     3, 2     0, -3

Player 1 chooses between two actions (U or D)

Player 2 simultaneously chooses between two actions (L or R)


The payoffs to player 1 are determined by her action as well as the action of player 2

The payoffs to player 2 are determined by her action as well as the action of player 1

The main insight of game theory comes from the “Nash equilibria”

Which often have counterintuitive properties, or allow us to clarify things we already “know.”

This game can be “solved” by finding the “Nash equilibrium”

(U, L) is a Nash equilibrium because neither player can benefit by unilaterally deviating

“Predictions” of game theory:

If both “expected” (U,L), both would play (U,L)!

(Nash is “self enforcing”)

(U,R) is NOT a Nash equilibrium because player 2 can benefit by unilaterally deviating to L

Game theory “predicts”:

If both expected (U,R), player 2 would deviate!

(I.e. if not Nash, cannot be “stable”)
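The deviation check described above can be sketched in a few lines of Python. This is only an illustration using the payoff matrix from the slides; the function names are made up for this example.

```python
from itertools import product

# Payoff matrix from the slides: rows are player 1's actions (U, D),
# columns are player 2's actions (L, R).
# Each cell is (payoff to player 1, payoff to player 2).
PAYOFFS = {
    ("U", "L"): (5, 6), ("U", "R"): (8, 4),
    ("D", "L"): (3, 2), ("D", "R"): (0, -3),
}
ROWS, COLS = ("U", "D"), ("L", "R")


def is_nash(row, col):
    """True if neither player gains by unilaterally deviating from (row, col)."""
    u1, u2 = PAYOFFS[(row, col)]
    # Player 1 may switch rows while player 2's column stays fixed.
    best_for_1 = all(PAYOFFS[(r, col)][0] <= u1 for r in ROWS)
    # Player 2 may switch columns while player 1's row stays fixed.
    best_for_2 = all(PAYOFFS[(row, c)][1] <= u2 for c in COLS)
    return best_for_1 and best_for_2


def pure_nash_equilibria():
    """List every pure-strategy profile that passes the deviation check."""
    return [(r, c) for r, c in product(ROWS, COLS) if is_nash(r, c)]


print(pure_nash_equilibria())
```

For this matrix the only profile that survives the check is (U, L); at (U, R), player 2 gets 4 but would get 6 by switching to L.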

Nash makes sense (arguably) if…

-Uber-rational

-Calculating

Such as Auctions…

Or Oligopolies…

Nash also makes sense (as 1st approx.) if:

1) strategies that yield higher payoffs, reproduce faster

2) evolutionary dynamics → Nash

e.g. Fisher and Sex Ratios

e.g. Maynard-Smith and territoriality

E.g. Zahavi’s handicap principle

But why would game theory matter for our preferences/ideologies?

We don’t “choose” what to find beautiful?

We didn’t “evolve” to like long finger nails?

But…

If preferences/ideologies are learned or evolve…

Nash becomes relevant…

Here’s why:

Our thesis (in a few steps):

1) Reinforcement learning or prestige-biased imitation causes behaviors that do well to grow in frequency…

Prestige-biased imitation (T=0 → T=1): more successful behaviors are imitated more

Reinforcement learning (T=0 → T=1): more successful behaviors are held more tenaciously

2) If one behavior is ALWAYS best…this will eventually lead to that behavior dominating…

[Figure: the population at T=0, T=1, T=2, T=3]

3) If, however, the behavior is a strategy in a game…strategies still become more frequent if they fare well…

But whether they do well could depend on what others are doing…so things can be a bit more complicated…nevertheless…

The dynamics often (Not always! Must Check!) settle on a NE
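As a rough illustration of step 3, here is a minimal sketch of a two-population replicator dynamic on the 2x2 game from the game theory review. The choice of dynamic, the Euler step size, and the function name are assumptions made for this example, not the specific dynamics used in the research.

```python
def simulate(x=0.5, y=0.5, dt=0.01, steps=5000):
    """Euler-integrate a two-population replicator dynamic on the 2x2 game.

    x is the share of the player-1 population playing U,
    y is the share of the player-2 population playing L.
    """
    for _ in range(steps):
        # Expected payoffs to player 1's actions, given how player 2s play.
        f_U = 5 * y + 8 * (1 - y)
        f_D = 3 * y + 0 * (1 - y)
        # Expected payoffs to player 2's actions, given how player 1s play.
        g_L = 6 * x + 2 * (1 - x)
        g_R = 4 * x + (-3) * (1 - x)
        # A strategy grows in frequency when it earns more than the alternative.
        x += dt * x * (1 - x) * (f_U - f_D)
        y += dt * y * (1 - y) * (g_L - g_R)
    return x, y


print(simulate(x=0.1, y=0.1))
```

Starting from a population in which only 10% play U and 10% play L, both shares climb toward 1, i.e. toward the Nash equilibrium (U, L). In this particular game U and L always earn more than the alternatives, so convergence is unsurprising; in other games the dynamics need to be checked case by case.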

[Figure: a population of L and R players at T=0, T=1, T=2, T=3]

4) Suppose now that instead of choosing actions in a game…people merely act in accordance with their ideologies or preferences…

But ideologies and preferences ALSO can be (at least partially!) imitated or learned…

E.g.

PL is a preference that causes action L to be taken…

[Figure: a population of PL and PR holders at T=0 and T=1]

Feelings/beliefs that do better become more frequent

Then behavior will end up “at a NE”

[Figure: the population of PL and PR holders at T=0, T=1, T=2, T=3]

Notice:

- (At NE with respect to the original payoffs, not the learned preferences!)
- (IF dynamics → Nash!)
- No need to be aware of “why” we have these preferences/ideologies (philosophers can spin their wheels on how these ideologies are “right”)
- IF dynamics → Nash under weak selection/high mutation, then don’t even need to believe preferences/ideologies are THAT responsive to payoffs…just a little bit…

And preferences/ideologies will have all sorts of (predictable) quirks that Nash has…

- e.g. might like wasteful displays, if preferred partners can display at lower cost…
- e.g. might desire to cooperate but not care about effectiveness...

OK…

What can this approach teach us?

What work needs to be done?

Some game theory modeling, demonstrating the quirky preferences/ideologies consistent with NE

Some evolutionary dynamics, demonstrating the NE emerges from dynamic processes.

Some experiments, testing the predictions of the models.

We will present examples of each:

1) Evolutionary dynamics of costly signals (which can explain why we evolved/learned to like authentic art…or why the Khasi like long nails)

2) Game Theory Model + Evolutionary Dynamics of “cooperate without looking”

(which can explain why we like those who are principled. And also why love blinds us…and why we find markets for kidneys gross…And also leads to valuable prescriptions!)

3) Experiment demonstrating that people give more efficiently when efficiency “commonly known”

(which is consistent with the ED+GT model of why we give inefficiently…and also leads to valuable prescription!)

Project 1: Experiment on (In)efficient Giving

- (preliminary) experiment demonstrating that people give more efficiently when efficiency is “commonly known”
- (brief) discussion of ED+GT model

Here’s the idea …

If preferences/ideologies that motivate us to give evolved/learned because of “reciprocity” or “partner choice”…

Then private information about effectiveness cannot matter

Why? Because others don’t know that information so can’t reciprocate, punish, match etc. based on it

Ran a simple MTurk experiment showing that people give more effectively not only when they know efficiency, but when efficiency is commonly known

Design:

Ask participants to distribute donations across a list of similar charities (in one of four categories)

Subjects were told their contributions would be observed by a third party

We obtained ratings for charities by scraping www.charitynavigator.org

Three treatments:

No Ratings: subjects provided no information about charity effectiveness

Private Ratings: subjects given ratings by an external rating source, but told 3rd party would not be given ratings

Public Ratings: subjects given ratings and told 3rd party would also be given ratings

Private Information – Condition

Public Information – Condition

Note: can explain why we are not impacted by

- matching
- or scope

and give to inefficient charities like Habitat for Humanity…

EVEN IF we knew (in)efficiency, since efficiency not commonly known!

Note: NOT just useless theorizing…leads to valuable prescription:

To “nudge” the most impact out of existing prosocial motives, need to make information about effectiveness not just known but ALSO commonly known.

Project 2: Evolutionary Dynamics of Costly Signals

Recall…

First let’s argue that long fingernails yield behavior consistent with Nash in a costly signaling model

(and also ideal weight, skin tans, and authentic art)

Recall Zahavi’s explanation for the peacock tail

Tail is costly for all, more costly for unfit males. So there is a NE where females are more likely to mate with males with long tails, and only fit males find the extra mating worth the cost of growing a long tail

Similarly for long nails…

Long nails costly for all (tail hinders flight)

More costly for farmers than teachers (more hindrance for unfit males)

Females prefer mating with teachers (peahens prefer fit males)

This, we will argue is WHY Khasi females find long nails beautiful

(we will later discuss how the same model can explain our other puzzles about beauty)

However, for this explanation to work, we need to be confident this Nash emerges in a dynamic model.

(No one chooses what to find beautiful, they simply “learn,” via RL or PBI)

We

- created a stylized costly signaling model, in which there are costly signaling equilibria (“separating equilibria”) as well as other less interesting equilibria
- investigated various dynamics and found conditions under which the separating equilibrium emerges.

Here is the stylized model…

[Figure: game tree of the stylized model, built up over several slides. Player 1 is High with probability P and Low otherwise (e.g. P=1/3). Player 1 chooses one of the signals S0, S1, S2, S3, and player 2 then chooses Accept or Reject.]

Sn < Sn+1

Sn <<< Sn+1 if low

e.g. 0,3,6,9 and 0,1,2,3

e.g. 5,5,-5

Examples of Strategy Profiles and Payoffs

<s0, s1, {s2, s3}>

(0,-1,0)

<s3, s1, {s3}>

(-1,-1,-10/3)

Nash Equilibrium = < , , > s.t. none benefit by unilaterally deviating

<s0, s2, {s2, s3}>

<s0, s3, {s3}>

<s0, s0, {}>

<s0, s1, {s1, s2 , s3}>

We will investigate the dynamics

E.g. Moran

[Figure sequence: three populations, High 1’s (playing e.g. s2, s3, s0), Low 1’s (playing e.g. s0, s1), and 2’s (each accepting {s2,s3}); e.g. NL=100, NH=100, N2=150]

Payoffs to the Low 1’s: -3 for those playing s1, 0 for those playing s0

Fitness is 1-w+w(payoffs), e.g. w=.1, which gives .6 for s1 and .9 for s0

[Figure sequence: the Low 1’s after successive updates; one s0 is replaced by s1, and then by s3]

With probability μ, choose a random strategy
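The update rule on these slides can be sketched roughly as follows. This is a minimal illustration of one Moran-style step within a single population, not the actual simulation code. The payoffs for s0 and s1 (0 and -3) and the fitness transform come from the slides; the payoffs for s2 and s3 are assumptions (costs of 6 and 9 minus an assumed acceptance benefit of 5), and in a full simulation payoffs would be recomputed from what the 2's currently accept.

```python
import random


def fitness(payoff, w):
    """Selection-strength transform from the slides: 1 - w + w * payoff."""
    return 1 - w + w * payoff


def moran_step(population, payoff_of, w, mu, strategies, rng):
    """One birth-death update within a single population of strategy labels."""
    # Pick a parent with probability proportional to fitness.
    weights = [fitness(payoff_of[s], w) for s in population]
    parent = rng.choices(population, weights=weights, k=1)[0]
    # With probability mu the newcomer adopts a random strategy instead.
    child = rng.choice(strategies) if rng.random() < mu else parent
    # The newcomer replaces a uniformly chosen member, so size stays fixed.
    new_population = list(population)
    new_population[rng.randrange(len(population))] = child
    return new_population


STRATEGIES = ["s0", "s1", "s2", "s3"]
# Payoffs to Low 1's when the 2's accept {s2, s3}.
# s0 and s1 are from the slides; s2 and s3 are ASSUMED (5 - 6 and 5 - 9).
LOW_PAYOFFS = {"s0": 0, "s1": -3, "s2": -1, "s3": -4}

rng = random.Random(0)
low_ones = ["s0", "s1", "s1", "s1", "s0", "s1", "s0"]
for _ in range(5):
    low_ones = moran_step(low_ones, LOW_PAYOFFS, w=0.1, mu=0.05,
                          strategies=STRATEGIES, rng=rng)
print(low_ones)
```

With w = .1 the fitness values stay positive for all of these payoffs, which is needed for them to serve as sampling weights.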

[Figure: simulation outcomes, with axes mu and w and labels <s0, s2, {s2, s3}>, <s0, s0, {}>, <s0, s3, {s3}>]

Efficient separating!

And if we aggregate across time, and many simulation runs?

Why is this equilibrium emerging? Because it is hard to get out of, compared to the other equilibrium.

As soon as receivers drift to accepting 2 or 3

Enough receivers must have “neutrally drifted” to accept 1 that sending 1 is worthwhile for good but not bad types

Since good but not bad types are sending 1, receivers start accepting 1, to the point where bad types start sending 1

Very quickly after bad types start sending 1, receivers stop accepting 1

If in the meantime receivers stop accepting 2 (by drift), then both good and bad types are better off sending 0

As soon as receivers drift to accepting 1 or 2

Is this result robust?

1) payoffs
2) noise
3) experimentation rate
4) reinforcement learning

Reinforcement Learning

Even works for super high experimentation rates!

Does depend on interesting new condition:

Do females prefer to pair with random male?

P=1/2

[Figure: simulation outcomes, with labels <s0, s0, {}>, <s0, s0, {s0}>, <s0, s2, {s2, s3}>, <s0, s3, {s3}>]

No longer easy to leave pooling!

How can we interpret this condition?

What if there aren’t that many farmers…e.g. in the U.S.?

Then we won’t be attracted to long finger nails!

Similarly if signals are not costly (art that is a replica?)

Or if more costly for high type (sun exposure in U.S. today vs past vs China?)

Conclusion:

Even though don’t “choose what’s beautiful”

Some aspects can still be explained by costly signaling…

Provided we are (ever so slightly) more likely to imitate successful people’s notion of beauty, or (even if just a tiny bit) more likely to adhere to notions of beauty when they lead to nice outcomes…

But for this conclusion…

We needed to show how evolutionary dynamics work in costly signaling games

Project 3: Game Theory model + Evo Dynamics of “Cooperating without Looking”

(recall: principled vs. strategic)

Suppose a friend asks you to proofread a paper…

You hesitate while thinking about how big a pain it is and say, “Hmm. Um. Well, OK.”

You get less credit than if you agreed w/o hesitation

Colleague asks you to attend his talk.

You ask, “will this benefit my research?” before agreeing to attend.

You get less credit than if you agreed without asking

Why do you get less credit for cooperating when you deliberate (“look”)?

Note: cannot be explained by existing models of repeated games, like repeated prisoner’s dilemma

(In such models, players can only attend to your past actions not “deliberation” process)

Intuitively…

Cooperators who don’t look (cwol) can be trusted to cooperate even when the temptation to defect is high

But how do we know this added trust is worth the cost of losing out on missed opportunities to defect?

We will…

1) Describe a simple model, “the envelope game”

2) Find (natural, intuitive) conditions under which CWOL is an equilibrium of this game

3) Show that even if agents are not consciously choosing their strategies but instead strategies are learned or evolved (replicator dynamic), cwol still emerges (i.e. has a sizeable basin of attraction)

4) Interpret these results in terms of some less straightforward social applications, such as why we:

1) like politicians who appear principled
2) shun taboo tradeoffs
3) are blinded by love

For which…

- Our equilibrium condition will yield novel predictions
- This analysis will also lead to some useful prescriptions

Here is our model…

“The Envelope Game”

First…

We model variation in costs of cooperation as follows:

• With probability p, Low Temptation “card” is chosen and stuffed in envelope

• With probability 1-p, High Temptation is chosen

Second…

• We model player 1’s choice of whether to “look”

• 1 chooses whether or not to open the envelope

Crucially we assume others (player 2) can observe whether the envelope was opened

Third…

1 then chooses whether or not to cooperate

2 is again able to observe

Fourth…

We model others’ “trust” in Player 1

Player 2 chooses whether to continue the interaction or exit

(If he continues, the game repeats, with future payoffs discounted by w)

We assume the payoffs have the following properties:

1) Cooperation is costly for Player 1, especially when the temptation is high

2) Both players like cooperative interactions, but Player 2 would prefer no interaction to one in which Player 1 sometimes defects

We represent this using the following variables:

        High Temptation    Low Temptation
C       a, b               a, b
D       cH, d              cL, d

Our assumptions then amount to:

1) cH > cL > a > 0

2) b*p + d*(1–p) < 0 < b

Result 1:

- 1 cooperates without looking (CWOL)
- 2 continues iff 1 CWOL

is an equilibrium, provided: a/(1-w) > cLp + cH(1-p)

Intuition:

If 1 deviates to look, might as well defect, in which case expect cLp + cH(1-p) today and 0 ever after

If CWOL, get a today and henceforth, i.e. a/(1-w)

CWOL is an equilibrium iff a/(1-w) > cLp + cH(1-p)

Interpretation:

CWOL is an equilibrium iff EXPECTED gains from defecting today are less than the value of maintaining a cooperative interaction

Let’s contrast this with equilibrium conditions for “cooperate with looking” (CWL) to see when looking matters

Now player 1 may be tempted to deviate when she knows the temptation is high, in which case she would get cH

So we need a/(1-w) > cH

I.e. CWL is an equilibrium iff MAXIMAL gains from defecting today are less than the value of maintaining a cooperative interaction

Hence we predict “Looking” will matter when the expected gains from defecting are small but the maximal gains are large
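The two equilibrium conditions can be compared numerically with a short sketch. The function names and the parameter values below are hypothetical, chosen only so that cH > cL > a > 0 holds; they are not taken from the research.

```python
def value_of_relationship(a, w):
    """Value of getting a today and in every future round: a / (1 - w)."""
    return a / (1 - w)


def cwol_is_equilibrium(a, w, c_low, c_high, p):
    """CWOL condition: a/(1-w) exceeds the EXPECTED gain from defecting."""
    expected_gain = c_low * p + c_high * (1 - p)
    return value_of_relationship(a, w) > expected_gain


def cwl_is_equilibrium(a, w, c_high):
    """CWL condition: a/(1-w) exceeds the MAXIMAL gain from defecting."""
    return value_of_relationship(a, w) > c_high


def looking_matters(a, w, c_low, c_high, p):
    """True when cooperation is sustained without looking but not with looking."""
    return (cwol_is_equilibrium(a, w, c_low, c_high, p)
            and not cwl_is_equilibrium(a, w, c_high))


# Hypothetical numbers: the low temptation is common (p = 0.9) and mild,
# the high temptation is rare and large.
print(looking_matters(a=1, w=0.8, c_low=2, c_high=8, p=0.9))
# With a more patient player 1 (w = 0.9), both conditions hold.
print(looking_matters(a=1, w=0.9, c_low=2, c_high=8, p=0.9))
```

In the first call the relationship is worth about 5, the expected temptation is 2.6, and the maximal temptation is 8, so looking matters. In the second call the relationship is worth about 10, which exceeds even the maximal temptation, so cooperation survives looking.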

Likewise, if we relax our assumption that b*p+d*(1–p)<0<b (but retain d<0<b)

We get another equilibrium where player 1 looks and defects when the temptation is high and player 2 exits iff 1 defects and the temptation is low

For looking to matter we ALSO need that defection is sufficiently bad for 2 that he doesn’t want to interact with 1s who even seldomly defect

What if strategies are not consciously calculated?

For instance, we might trust people based on a “gut feeling” or we might refuse to interact with people who disobey our ethics. Or we might just have a heuristic that tells us not to look

That’s where the main thesis of last class fits in!

We assume that feelings, ethics, heuristics that yield higher payoffs are more likely to be imitated (prestige biased imitation), reproduce (natural selection), or held tenaciously (reinforcement learning)

We will model this using the replicator dynamic, and show CWOL also emerges in the relevant parameter region

In the replicator dynamic…

- there is an infinite population of each of player 1’s and player 2’s
- at any point in time, each strategy has a certain frequency
- payoffs are determined based on the expected opponent’s play, given this frequency
- strategies reproduce proportional to their payoffs

Note:

- Replicator requires few strategies, so we restrict to 7 that include all “important deviants”
- Replicator cannot be solved analytically; we numerically estimate in computer simulation
- We will need to classify strategies into those that are behaviorally equivalent

Restricted Strategy Space

Classifying Populations

Simulation

For each of many parameter values…

For each of 5,000 trials…

We seed the population with random mixtures of strategies

Numerically estimate the replicator dynamic (which is an ODE)

Wait for the population to stabilize

Then classify the outcomes (ignoring small errors)
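The procedure above (seed randomly, integrate, wait, classify) can be sketched on a toy example. Since the 7-strategy space is not listed here, this sketch swaps in a hypothetical symmetric coordination game with two strategies, A and B; the payoffs, step size, and tolerance are all assumptions made for illustration.

```python
import random


def replicator_run(x0, dt=0.05, steps=1000):
    """Euler-integrate a one-population replicator ODE; x is the share playing A.

    Hypothetical payoffs: A earns 2 against A, B earns 1 against B,
    and mismatched pairs earn 0.
    """
    x = x0
    for _ in range(steps):
        f_a = 2 * x          # expected payoff to A
        f_b = 1 * (1 - x)    # expected payoff to B
        x += dt * x * (1 - x) * (f_a - f_b)
    return x


def classify(x, tol=0.01):
    """Label the final population, ignoring small errors."""
    if x > 1 - tol:
        return "all A"
    if x < tol:
        return "all B"
    return "unclassified"


def basin_shares(trials=500, seed=0):
    """Share of random initial mixtures that end up at each outcome."""
    rng = random.Random(seed)
    counts = {"all A": 0, "all B": 0, "unclassified": 0}
    for _ in range(trials):
        counts[classify(replicator_run(rng.random()))] += 1
    return {label: n / trials for label, n in counts.items()}


print(basin_shares())
```

In this toy game the dividing line is at a one-third share of A, so roughly two thirds of random seeds should end at "all A". The share of seeds that reach an outcome is one simple way to measure the size of its basin of attraction.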

We find…

Population ends up at CWOL fairly often in relevant parameter region

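In terms of a, the equilibrium conditions above are a > a* for CWOL and a > a** for CWL, where a* = (1-w)(cLp + cH(1-p)) and a** = (1-w)cH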

Now let’s discuss some social applications…

First application: Why do we like politicians who have “principles” and not those who are “strategic” (e.g. those who “flip flop”)?

(and more generally, why do we like those who are principled? And when will we care?)

We argue…

Someone who is “strategic” is likely to choose the policy that benefits himself when given the power

When will we care if others are strategic? Not if incentives are sufficiently aligned that they won’t ever be in a position where they are tempted to drastically harm us (i.e. we care only if b*p+d*(1–p)<0<b)

E.g., crucial that girlfriend/boyfriend is principled, but not so crucial that doubles partner is, because she has no occasions where tempted to really harm you

Second application: Taboo tradeoffs

I.e. unwillingness to trade off the sacred (e.g. life) against the mundane (e.g. money)

E.g. many find economists perverse for applying cost benefit to value of life

Taboo to CONSIDER tradeoff

We find such tradeoffs disgusting because…

Such tradeoffs signal a willingness to look at the benefits of defecting

Disgust signals we wouldn’t look, or wouldn’t interact with a looker

When will we find such tradeoffs disgusting?

Usually not worth transgressing, but sometimes very beneficial

i.e. cH > a/(1-w) > cLp + cH(1-p)

(And those transgressions harmful)

This has an important policy implication:

While politicians might want to signal that they would never trade lives for money

The gains from appearing trustworthy accrue to the politicians while costs accrue to us

We should force policymakers and lawyers to tackle these admittedly hard-to-fathom tradeoffs

Third application: love

Love has the property that it “blinds us” to the costs and benefits of doing good to our partner (e.g., don’t cheat regardless of opportunity)

Why does love have this property?

Because those in love can be trusted, so will make better long-term partners

Gives three new predictions about love

First…

Falling in or out of love depends on the distribution of temptations, but not their immediate realizations

That is, people may fall out of love when there is a permanent change in opportunities, but not an immediate temptation

Second…

Love comes with a cost–the cost of ignored temptations–and suggests that this cost must be compensated with commensurate investment in the relationship

Only sometimes is it worthwhile for the recipient of love to compensate a suitor for such missed opportunities, o/w will prefer suitor not fall in love

Third…

Just looking at the costs or benefits can hasten the demise of a relationship, even if don’t defect

Perhaps why partner may get upset if sees you “looking” even if never act on temptation

Or why get upset when partners suggest a pre-nup

General conclusion:

In our research…

We try to understand where such preferences and ideologies come from.

Using evolutionary dynamics + game theory + experiments

Some game theory modeling, demonstrating the quirky preferences/ideologies consistent with NE

Some evolutionary dynamics, demonstrating the NE emerges from dynamic processes.

Some experiments, testing the predictions of the models.

We will present examples of each:

1) Evolutionary dynamics of costly signals (which can explain why we evolved/learned to like authentic art…or why the Khasi like long nails)

2) Game Theory Model + Evolutionary Dynamics of “cooperate without looking”

(which can explain why we like those who are principled. And also why love blinds us…and why we find markets for kidneys gross…And also leads to valuable prescriptions!)

3) Experiment demonstrating that people give more efficiently when efficiency “commonly known”

(which is consistent with the ED+GT model of why we give inefficiently…and also leads to valuable prescription!)
