TRANSCRIPT
Evolution & Learning in Games, Econ 243B
Jean-Paul Carvalho
Lecture 17: Fast Convergence
1 / 24
The Very Long Run?
Based on notes by H. Peyton Young
Kreindler & Young (2013 GEB, 2014 PNAS)
I In the long run, the stochastic dynamic spends almost all of its time in the stochastically stable states.
I However, the expected waiting time to reach a stochastically stable state grows exponentially as the error rate ε becomes arbitrarily small.
I Is stochastic stability only relevant in the very long run?
2 / 24
Intermediate Error Rates
In this lecture, we see that for intermediate values of ε there is:
I Sharp selection.
I Fast convergence.
These results hold when agents respond to:
(a) the distribution of actions in the whole population,
(b) random samples from the population,
(c) their neighbors in a network.
3 / 24
Model
Population size N, large but finite.
2 × 2 symmetric pure coordination game:

            A             B
A     1+α, 1+α         0, 0
B       0, 0           1, 1
I (B,B) is the status quo.
I (A,A) is the innovation, with α > 0 the payoff gain to the deviation.
4 / 24
General Coordination Game
Can generalize to any 2 × 2 symmetric coordination game:

            A             B
A       a, a           d, c
B       c, d           b, b

with α redefined as the normalized potential difference:

α = [(a − d) − (b − c)] / (b − c).
5 / 24
Strategy Revisions
I Time is discrete.
I Each period lasts τ = 1/N units of time.
I Each period, one randomly chosen agent revises:
Step 1. Gathers information on current play,
Step 2. Chooses a (myopic) noisy best response.
6 / 24
Logit Learning
Step 1. Agent forms an estimate x of the adoption rate of A.
(a) Full information: agent knows the current proportion of adopters in the population.
(b) Partial information: agent randomly samples d other players.
Step 2. Noisy best response given by the logit function:

Pr(Choose A | x) = e^{β(1+α)x} / (e^{β(1+α)x} + e^{β(1−x)}).

The error rate (at zero adoption) is ε = 1/(1 + e^β).
7 / 24
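As a numerical companion to the logit rule above, the choice probability and the implied error rate can be computed directly; a minimal sketch in Python (the function name `logit_choose_A` is mine, not from the lecture):

```python
import math

def logit_choose_A(x, alpha, beta):
    """Logit noisy best response: probability of choosing A when the
    estimated adoption rate of A is x. Payoff to A is (1 + alpha) * x,
    payoff to B is (1 - x); beta is the noise parameter."""
    uA = beta * (1 + alpha) * x
    uB = beta * (1 - x)
    return math.exp(uA) / (math.exp(uA) + math.exp(uB))

# At zero adoption the formula reduces to the error rate eps = 1/(1 + e^beta);
# beta = 3 gives eps of roughly 5%.
beta = 3.0
eps = logit_choose_A(0.0, 0.5, beta)
assert abs(eps - 1 / (1 + math.exp(beta))) < 1e-12
```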
Logit Learning: α = 0.5, β = 3.5
[Figure: logit response function.]
8 / 24
Logit Learning: α = 0.5, various β
[Figure: logit response functions for several values of β.]
9 / 24
Convergence Times
I Study the process Γ^N(α, β) starting from the all-B state (status quo).
I State variable: adoption rate x(t) of innovation A.
I Define the waiting time to adoption level p < 1:

T^N(α, β, p) = min{t : x(t) ≥ p}.
10 / 24
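The waiting time to a given adoption level can be estimated by direct simulation of the revision process; a sketch under the full-information protocol (the names `waiting_time` and `logit_choose_A` are mine, not from the lecture):

```python
import math
import random

def logit_choose_A(x, alpha, beta):
    """Logit noisy best response given adoption rate x."""
    uA = beta * (1 + alpha) * x
    uB = beta * (1 - x)
    return math.exp(uA) / (math.exp(uA) + math.exp(uB))

def waiting_time(N, alpha, beta, p, max_periods=10**6, seed=0):
    """Start from the all-B state; each period (of length 1/N) one random
    agent revises via logit choice. Return the first time t with x(t) >= p."""
    rng = random.Random(seed)
    adopters = 0
    for period in range(max_periods):
        if adopters / N >= p:
            return period / N  # elapsed time = periods * tau, tau = 1/N
        currently_A = rng.randrange(N) < adopters   # revising agent's action
        plays_A = rng.random() < logit_choose_A(adopters / N, alpha, beta)
        if plays_A and not currently_A:
            adopters += 1
        elif not plays_A and currently_A:
            adopters -= 1
    return float("inf")

# With beta = 3 (eps ~ 5%) and alpha = 1, adoption by a majority is fast.
wt = waiting_time(100, 1.0, 3.0, 0.5)
```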
Sample Adoption Path: ε = 5%, α = 100%, p = 99%, N = 1000
[Figure: sample adoption path, adoption rate against time.]
11 / 24
Fast Convergence - definitions
Definition 1. The family Γ^N(α, β) exhibits fast convergence if the expected waiting time until a majority of agents play A is bounded independently of N, i.e.,

E[T^N(α, β, 1/2)] < S(α, β) for all N.

Definition 2. The family Γ^N(α, β) exhibits fast convergence to p if the expected waiting time to adoption level p is bounded independently of N, i.e.,

E[T^N(α, β, p)] < S(α, β, p) for all N.
12 / 24
Fast Convergence - result
Theorem 16.1. Let

h(β) = (e^{β−1} + 4 − e)/β − 2   for β > 2,
h(β) = 0                        for 0 < β ≤ 2.

If α > h(β), then Γ^N(α, β) exhibits fast convergence.
13 / 24
Error Rate
Recall that the error rate ε is the probability of choosing A when you expect your opponent to choose B:

ε = e^0 / (e^0 + e^β) = 1/(1 + e^β).

I β = 2 gives ε ≈ 12%.
I β = 3 gives ε ≈ 5%.
14 / 24
Threshold for Fast Convergence
[Figure: FAST and SLOW regions; h = estimated boundary for fast convergence, α* = actual boundary.]
15 / 24
Proof Strategy
Pr(Choose A | x) = f(x; α, β) = e^{β(1+α)x} / (e^{β(1+α)x} + e^{β(1−x)}).

Recall that the continuous-time mean logit dynamic is the differential equation:

ẋ = f(x; α, β) − x, with x(0) = 0.

The logit equilibria are the fixed points:

f(x*; α, β) = x*.

The key is to find combinations of α and β such that the lowest fixed point is greater than 1/2.
16 / 24
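The logit equilibria for given α and β can be located numerically; a sketch (the helper `fixed_points` is my name) that scans for crossings of f(x) with the 45-degree line and refines each by bisection:

```python
import math

def f(x, alpha, beta):
    """Logit response: probability of choosing A given adoption rate x."""
    uA = beta * (1 + alpha) * x
    uB = beta * (1 - x)
    return math.exp(uA) / (math.exp(uA) + math.exp(uB))

def fixed_points(alpha, beta, grid=20000):
    """Find logit equilibria f(x*) = x* by scanning [0, 1] for sign
    changes of g(x) = f(x) - x and refining each root by bisection."""
    g = lambda x: f(x, alpha, beta) - x
    roots = []
    for i in range(grid):
        lo, hi = i / grid, (i + 1) / grid
        if g(lo) * g(hi) < 0:
            for _ in range(60):
                mid = (lo + hi) / 2
                if g(lo) * g(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append((lo + hi) / 2)
    return roots

# For alpha = 0.5, beta = 3.5 there are three equilibria, with the lowest
# near the status quo; for larger alpha only a high equilibrium survives.
low_alpha_eq = fixed_points(0.5, 3.5)
high_alpha_eq = fixed_points(2.0, 3.5)
```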
Proof Strategy
[Figure: the status quo equilibrium disappears when α is big enough.]
17 / 24
Proof Strategy
Let x0 be the tangency point. Then:

f′(x0; α, β) = 1   (1)

and

f(x0; α, β) = x0.   (2)

Note that:

f′(x0; α, β) = β(2 + α) f(x0)(1 − f(x0)) = 1.   (3)

Combining (2) and (3) yields

x0(1 − x0) = 1/(β(2 + α)).
18 / 24
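The identity in equation (3), f′(x) = β(2 + α) f(x)(1 − f(x)), can be sanity-checked against a finite-difference derivative; the parameter values below are arbitrary test values, not from the lecture:

```python
import math

def f(x, alpha, beta):
    """Logit response function from the proof."""
    uA = beta * (1 + alpha) * x
    uB = beta * (1 - x)
    return math.exp(uA) / (math.exp(uA) + math.exp(uB))

# Compare the analytic derivative beta*(2+alpha)*f*(1-f) with a
# central finite difference at an arbitrary interior point.
alpha, beta, x, h = 0.5, 3.5, 0.2, 1e-6
numeric = (f(x + h, alpha, beta) - f(x - h, alpha, beta)) / (2 * h)
analytic = beta * (2 + alpha) * f(x, alpha, beta) * (1 - f(x, alpha, beta))
assert abs(numeric - analytic) < 1e-6
```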
Proof Strategy
If x0 is small, x0² is very small. Hence

β(2 + α) x0 ≈ 1.   (4)

Then

f(x0) = e^{β(1+α)x0} / (e^{β(1+α)x0} + e^{β(1−x0)})
      = 1/(1 + e^{β − β(2+α)x0})
      ≈ 1/(1 + e^{β−1})
      ≈ x0 ≈ 1/(β(2 + α)).

Therefore,

β(2 + α) ≈ e^{β−1} + 1.

This defines the approximate combinations of α and β required to lift f(x) off the 45-degree line.

In fact, we need β(2 + α) > e^{β−1} + 4 − e.

19 / 24
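The approximation β(2 + α) ≈ e^{β−1} + 1 can be compared with the boundary found numerically; in this sketch, `critical_alpha` (my name, not from the lecture) bisects for the smallest α at which the low fixed point disappears:

```python
import math

def f(x, alpha, beta):
    """Logit response function."""
    uA = beta * (1 + alpha) * x
    uB = beta * (1 - x)
    return math.exp(uA) / (math.exp(uA) + math.exp(uB))

def critical_alpha(beta, grid=2000):
    """Smallest alpha (found by bisection) for which f(x) > x everywhere
    on (0, 1/2), i.e. the status quo equilibrium has just disappeared."""
    def low_eq_exists(alpha):
        return any(f(i / grid, alpha, beta) <= i / grid
                   for i in range(1, grid // 2))
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if low_eq_exists(mid):
            lo = mid
        else:
            hi = mid
    return hi

# Compare the numerical boundary with the small-x0 approximation
# alpha ~ (e^(beta-1) + 1)/beta - 2 for an arbitrary beta > 2.
beta = 3.5
actual = critical_alpha(beta)
approx = (math.exp(beta - 1) + 1) / beta - 2
```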
Average Waiting Times
ε = 5%, p = 99%

            N = 100   N = 1000   N = 10,000
α = 70%        33        101      > 8,000
α = 80%        25         36           35

ε = 10%, p = 50% (top row), p = 90% (bottom row)

            N = 100   N = 1000   N = 10,000
α = 4%         38        190      > 8,000
α = 25%        19         20           21
20 / 24
Partial Information
Analogous results hold when agents draw random samples from the population:

I Sample size d < ∞ fixed independently of N.
I Updating function becomes:

f_d(x; α, β) = ∑_{k=0}^{d} (d choose k) x^k (1 − x)^{d−k} f(k/d; α, β).

Theorem 16.2. Assume d ≥ 3. If α > min{h(β), d − 2}, the process exhibits fast convergence.
21 / 24
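The sampling response f_d is a binomial mixture of the full-information response f, and is straightforward to evaluate; a sketch written out directly from the formula above (`f_d` is my name for it):

```python
import math

def f(x, alpha, beta):
    """Full-information logit response."""
    uA = beta * (1 + alpha) * x
    uB = beta * (1 - x)
    return math.exp(uA) / (math.exp(uA) + math.exp(uB))

def f_d(x, alpha, beta, d):
    """Expected logit response when the agent estimates the adoption rate
    from a random sample of d players: a binomial mixture of f(k/d)."""
    return sum(math.comb(d, k) * x**k * (1 - x)**(d - k) * f(k / d, alpha, beta)
               for k in range(d + 1))

# At x = 0 or x = 1 the sample is deterministic, so f_d coincides with f.
```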
Payoff Heterogeneity
Now suppose agents choose exact best responses, but payoffs are perturbed by small shocks.

∆ is the payoff gain from choosing A:

∆ = (1 + α)x − (1 − x).

Let ε_A and ε_B be i.i.d. (idiosyncratic) payoff shocks from playing A and B. Then:

Pr(Choose A | x) = Pr(ε_A − ε_B + ∆ > 0).

I If ε_A and ε_B are extreme value distributed, this is logit choice.
I For comparison, consider normally distributed payoff shocks, ε_A, ε_B ∼ N(0, 1/β).
22 / 24
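With normal shocks the choice rule becomes probit: since ε_A − ε_B ∼ N(0, 2/β), we get Pr(Choose A | x) = Φ(∆ √(β/2)), where Φ is the standard normal CDF. A sketch using only the standard library (`normal_choose_A` is my name):

```python
import math

def normal_choose_A(x, alpha, beta):
    """Probability of choosing A when payoff shocks eps_A, eps_B are
    i.i.d. N(0, 1/beta): eps_A - eps_B ~ N(0, 2/beta), so the rule is
    probit with Pr(A | x) = Phi(Delta * sqrt(beta / 2))."""
    delta = (1 + alpha) * x - (1 - x)               # payoff gain from A
    z = delta * math.sqrt(beta / 2)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))   # standard normal CDF

# As with logit, the probability is 1/2 exactly where Delta = 0,
# i.e. at x = 1/(2 + alpha), and the error rate at x = 0 is positive.
```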
Fast Convergence
[Figure: response function (left) and fast diffusion threshold (right); normally distributed payoff shocks (gray dots) vs. extreme value (black line), d = 15.]
23 / 24
Conclusion
Evolutionary selection occurs within realistic time frames for plausible levels of error and/or payoff heterogeneity.
For the role of networks in fast convergence, see:
I Morris (2000 ReStud),
I Kreindler & Young (2014 PNAS),
I Arieli & Young (2016 Ecta).
24 / 24