TRANSCRIPT
Evolution & Learning in Games, Econ 243B
Jean-Paul Carvalho
Lecture 17: Fast Convergence
1 / 24
The Very Long Run?
Based on notes by H. Peyton Young
Kreindler & Young (2013 GEB, 2014 PNAS)
I In the long run, the stochastic dynamic spends almost all of its time in the stochastically stable states.
I However, the expected waiting time to reach a stochastically stable state grows exponentially as the error rate ε becomes arbitrarily small.
I Is stochastic stability only relevant in the very long run?
2 / 24
Intermediate Error Rates
In this lecture, we see that for intermediate values of ε there is:
I Sharp selection.
I Fast convergence.
These results hold when agents respond to:
(a) the distribution of actions in the whole population,
(b) random samples from the population,
(c) their neighbors in a network.
3 / 24
Model
Population size N, large but finite.
2 × 2 symmetric pure coordination game:

            A             B
A     1+α, 1+α         0, 0
B       0, 0           1, 1
I (B,B) is the status quo.
I (A,A) is the innovation, with α > 0 the payoff gain to the deviation.
4 / 24
General Coordination Game
Can generalize to any 2 × 2 symmetric coordination game:

            A             B
A       a, a           d, c
B       c, d           b, b

with α redefined as the normalized potential difference:

α = [(a − d) − (b − c)] / (b − c).
5 / 24
Strategy Revisions
I Time is discrete.
I Each period lasts τ = 1/N units of time.
I Each period, one randomly chosen agent revises:
Step 1. Gathers information on current play,
Step 2. Chooses a (myopic) noisy best response.
6 / 24
Logit Learning
Step 1. Agent forms an estimate x of the adoption rate of A.
(a) Full information: agent knows the current proportion of adopters in the population.
(b) Partial information: agent randomly samples d other players.
Step 2. Noisy best response given by the logit function:

Pr(Choose A | x) = e^{β(1+α)x} / (e^{β(1+α)x} + e^{β(1−x)}).

The error rate (at zero adoption) is ε = 1/(1 + e^β).
7 / 24
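As a numerical companion to the logit rule above, the choice probability and the implied error rate can be computed directly; a minimal sketch in Python (the function name `logit_choose_A` is mine, not from the lecture):

```python
import math

def logit_choose_A(x, alpha, beta):
    """Logit noisy best response: probability of choosing A when the
    estimated adoption rate of A is x. Payoff to A is (1 + alpha) * x,
    payoff to B is (1 - x); beta is the noise parameter."""
    uA = beta * (1 + alpha) * x
    uB = beta * (1 - x)
    return math.exp(uA) / (math.exp(uA) + math.exp(uB))

# At zero adoption the formula reduces to the error rate eps = 1/(1 + e^beta);
# beta = 3 gives eps of roughly 5%.
beta = 3.0
eps = logit_choose_A(0.0, 0.5, beta)
assert abs(eps - 1 / (1 + math.exp(beta))) < 1e-12
```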
Logit Learning: α = 0.5, β = 3.5
[Figure: logit response function.]
8 / 24
Logit Learning: α = 0.5, various β
[Figure: logit response functions for several values of β.]
9 / 24
Convergence Times
I Study the process Γ^N(α, β) starting from the all-B state (status quo).
I State variable: adoption rate x(t) of innovation A.
I Define the waiting time to adoption level p < 1:

T^N(α, β, p) = min{t : x(t) ≥ p}.
10 / 24
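The waiting time to a given adoption level can be estimated by direct simulation of the revision process; a sketch under the full-information protocol (the names `waiting_time` and `logit_choose_A` are mine, not from the lecture):

```python
import math
import random

def logit_choose_A(x, alpha, beta):
    """Logit noisy best response given adoption rate x."""
    uA = beta * (1 + alpha) * x
    uB = beta * (1 - x)
    return math.exp(uA) / (math.exp(uA) + math.exp(uB))

def waiting_time(N, alpha, beta, p, max_periods=10**6, seed=0):
    """Start from the all-B state; each period (of length 1/N) one random
    agent revises via logit choice. Return the first time t with x(t) >= p."""
    rng = random.Random(seed)
    adopters = 0
    for period in range(max_periods):
        if adopters / N >= p:
            return period / N  # elapsed time = periods * tau, tau = 1/N
        currently_A = rng.randrange(N) < adopters   # revising agent's action
        plays_A = rng.random() < logit_choose_A(adopters / N, alpha, beta)
        if plays_A and not currently_A:
            adopters += 1
        elif not plays_A and currently_A:
            adopters -= 1
    return float("inf")

# With beta = 3 (eps ~ 5%) and alpha = 1, adoption by a majority is fast.
wt = waiting_time(100, 1.0, 3.0, 0.5)
```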
Sample Adoption Path: ε = 5%, α = 100%, p = 99%, N = 1000
[Figure: sample adoption path, adoption rate against time.]
11 / 24
Fast Convergence - definitions
Definition 1. The family Γ^N(α, β) exhibits fast convergence if the expected waiting time until a majority of agents play A is bounded independently of N, i.e.,

E[T^N(α, β, 1/2)] < S(α, β) for all N.

Definition 2. The family Γ^N(α, β) exhibits fast convergence to p if the expected waiting time to adoption level p is bounded independently of N, i.e.,

E[T^N(α, β, p)] < S(α, β, p) for all N.
12 / 24
Fast Convergence - result
Theorem 16.1. Let

h(β) = (e^{β−1} + 4 − e)/β − 2   for β > 2,
h(β) = 0                        for 0 < β ≤ 2.

If α > h(β), then Γ^N(α, β) exhibits fast convergence.
13 / 24
Error Rate
Recall that the error rate ε is the probability of choosing A when you expect your opponent to choose B:

ε = e^0 / (e^0 + e^β) = 1/(1 + e^β).

I β = 2 gives ε ≈ 12%.
I β = 3 gives ε ≈ 5%.
14 / 24
Threshold for Fast Convergence
[Figure: FAST and SLOW regions; h = estimated boundary for fast convergence, α* = actual boundary.]
15 / 24
Proof Strategy
Pr(Choose A | x) = f(x; α, β) = e^{β(1+α)x} / (e^{β(1+α)x} + e^{β(1−x)}).

Recall that the continuous-time mean logit dynamic is the differential equation:

ẋ = f(x; α, β) − x, with x(0) = 0.

The logit equilibria are the fixed points:

f(x*; α, β) = x*.

The key is to find combinations of α and β such that the lowest fixed point is greater than 1/2.
16 / 24
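The logit equilibria for given α and β can be located numerically; a sketch (the helper `fixed_points` is my name) that scans for crossings of f(x) with the 45-degree line and refines each by bisection:

```python
import math

def f(x, alpha, beta):
    """Logit response: probability of choosing A given adoption rate x."""
    uA = beta * (1 + alpha) * x
    uB = beta * (1 - x)
    return math.exp(uA) / (math.exp(uA) + math.exp(uB))

def fixed_points(alpha, beta, grid=20000):
    """Find logit equilibria f(x*) = x* by scanning [0, 1] for sign
    changes of g(x) = f(x) - x and refining each root by bisection."""
    g = lambda x: f(x, alpha, beta) - x
    roots = []
    for i in range(grid):
        lo, hi = i / grid, (i + 1) / grid
        if g(lo) * g(hi) < 0:
            for _ in range(60):
                mid = (lo + hi) / 2
                if g(lo) * g(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append((lo + hi) / 2)
    return roots

# For alpha = 0.5, beta = 3.5 there are three equilibria, with the lowest
# near the status quo; for larger alpha only a high equilibrium survives.
low_alpha_eq = fixed_points(0.5, 3.5)
high_alpha_eq = fixed_points(2.0, 3.5)
```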
Proof Strategy
[Figure: the status quo equilibrium disappears when α is big enough.]
17 / 24
Proof Strategy
Let x0 be the tangency point. Then:

f′(x0; α, β) = 1   (1)

and

f(x0; α, β) = x0.   (2)

Note that:

f′(x0; α, β) = β(2 + α) f(x0)(1 − f(x0)) = 1.   (3)

Combining (2) and (3) yields

x0(1 − x0) = 1/(β(2 + α)).
18 / 24
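The identity in equation (3), f′(x) = β(2 + α) f(x)(1 − f(x)), can be sanity-checked against a finite-difference derivative; the parameter values below are arbitrary test values, not from the lecture:

```python
import math

def f(x, alpha, beta):
    """Logit response function from the proof."""
    uA = beta * (1 + alpha) * x
    uB = beta * (1 - x)
    return math.exp(uA) / (math.exp(uA) + math.exp(uB))

# Compare the analytic derivative beta*(2+alpha)*f*(1-f) with a
# central finite difference at an arbitrary interior point.
alpha, beta, x, h = 0.5, 3.5, 0.2, 1e-6
numeric = (f(x + h, alpha, beta) - f(x - h, alpha, beta)) / (2 * h)
analytic = beta * (2 + alpha) * f(x, alpha, beta) * (1 - f(x, alpha, beta))
assert abs(numeric - analytic) < 1e-6
```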
Proof Strategy
If x0 is small, x0² is very small. Hence

β(2 + α) x0 ≈ 1.   (4)

Then

f(x0) = e^{β(1+α)x0} / (e^{β(1+α)x0} + e^{β(1−x0)})
      = 1/(1 + e^{β − β(2+α)x0})
      ≈ 1/(1 + e^{β−1})
      ≈ x0 ≈ 1/(β(2 + α)).

Therefore,

β(2 + α) ≈ e^{β−1} + 1.

This defines the approximate combinations of α and β required to lift f(x) off the 45-degree line.

In fact, we need β(2 + α) > e^{β−1} + 4 − e.

19 / 24
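The approximation β(2 + α) ≈ e^{β−1} + 1 can be compared with the boundary found numerically; in this sketch, `critical_alpha` (my name, not from the lecture) bisects for the smallest α at which the low fixed point disappears:

```python
import math

def f(x, alpha, beta):
    """Logit response function."""
    uA = beta * (1 + alpha) * x
    uB = beta * (1 - x)
    return math.exp(uA) / (math.exp(uA) + math.exp(uB))

def critical_alpha(beta, grid=2000):
    """Smallest alpha (found by bisection) for which f(x) > x everywhere
    on (0, 1/2), i.e. the status quo equilibrium has just disappeared."""
    def low_eq_exists(alpha):
        return any(f(i / grid, alpha, beta) <= i / grid
                   for i in range(1, grid // 2))
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if low_eq_exists(mid):
            lo = mid
        else:
            hi = mid
    return hi

# Compare the numerical boundary with the small-x0 approximation
# alpha ~ (e^(beta-1) + 1)/beta - 2 for an arbitrary beta > 2.
beta = 3.5
actual = critical_alpha(beta)
approx = (math.exp(beta - 1) + 1) / beta - 2
```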
Average Waiting Times
ε = 5%, p = 99%

            N = 100   N = 1000   N = 10,000
α = 70%        33        101      > 8,000
α = 80%        25         36           35

ε = 10%, p = 50% (top row), p = 90% (bottom row)

            N = 100   N = 1000   N = 10,000
α = 4%         38        190      > 8,000
α = 25%        19         20           21
20 / 24
Partial Information
Analogous results hold when agents draw random samples from the population:

I Sample size d < ∞ fixed independently of N.
I Updating function becomes:

f_d(x; α, β) = ∑_{k=0}^{d} (d choose k) x^k (1 − x)^{d−k} f(k/d; α, β).

Theorem 16.2. Assume d ≥ 3. If α > min{h(β), d − 2}, the process exhibits fast convergence.
21 / 24
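The sampling response f_d is a binomial mixture of the full-information response f, and is straightforward to evaluate; a sketch written out directly from the formula above (`f_d` is my name for it):

```python
import math

def f(x, alpha, beta):
    """Full-information logit response."""
    uA = beta * (1 + alpha) * x
    uB = beta * (1 - x)
    return math.exp(uA) / (math.exp(uA) + math.exp(uB))

def f_d(x, alpha, beta, d):
    """Expected logit response when the agent estimates the adoption rate
    from a random sample of d players: a binomial mixture of f(k/d)."""
    return sum(math.comb(d, k) * x**k * (1 - x)**(d - k) * f(k / d, alpha, beta)
               for k in range(d + 1))

# At x = 0 or x = 1 the sample is deterministic, so f_d coincides with f.
```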
Payoff Heterogeneity
Now suppose agents choose exact best responses, but payoffs are perturbed by small shocks.

∆ is the payoff gain from choosing A:

∆ = (1 + α)x − (1 − x).

Let ε_A and ε_B be i.i.d. (idiosyncratic) payoff shocks from playing A and B. Then:

Pr(Choose A | x) = Pr(ε_A − ε_B + ∆ > 0).

I If ε_A and ε_B are extreme value distributed, this is logit choice.
I For comparison, consider normally distributed payoff shocks, ε_A, ε_B ∼ N(0, 1/β).
22 / 24
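With normal shocks the choice rule becomes probit: since ε_A − ε_B ∼ N(0, 2/β), we get Pr(Choose A | x) = Φ(∆ √(β/2)), where Φ is the standard normal CDF. A sketch using only the standard library (`normal_choose_A` is my name):

```python
import math

def normal_choose_A(x, alpha, beta):
    """Probability of choosing A when payoff shocks eps_A, eps_B are
    i.i.d. N(0, 1/beta): eps_A - eps_B ~ N(0, 2/beta), so the rule is
    probit with Pr(A | x) = Phi(Delta * sqrt(beta / 2))."""
    delta = (1 + alpha) * x - (1 - x)               # payoff gain from A
    z = delta * math.sqrt(beta / 2)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))   # standard normal CDF

# As with logit, the probability is 1/2 exactly where Delta = 0,
# i.e. at x = 1/(2 + alpha), and the error rate at x = 0 is positive.
```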
Fast Convergence
[Figure: response function (left) and fast diffusion threshold (right); normally distributed payoff shocks (gray dots) vs. extreme value (black line), d = 15.]
23 / 24
Conclusion
Evolutionary selection occurs within realistic time frames for plausible levels of error and/or payoff heterogeneity.
For the role of networks in fast convergence, see:
I Morris (2000 ReStud),
I Kreindler & Young (2014 PNAS),
I Arieli & Young (2016 Ecta).
24 / 24