emergence of cooperation through coevolving time scale in spatial prisoner’s dilemma

© 2002 IBM Corporation

Donghua UniversityDonghua University

Emergence of cooperation through coevolving time scale in spatial prisoner’s dilemma

Zhihai Rong ( 荣智海 )[email protected] Donghua University

2010.08@The 4th China-Europe Summer School on Complexity Science, Shanghai

DHUDHU Donghua UniversityDonghua University

2

Acknowledgements

Dr. Zhi-Xi Wu

Dr. Wen-Xu Wang

Dr. Petter Holme

Zhi-Xi Wu, Zhihai Rong & Petter Holme, Phys.Rev.E,036106,2010

Zhihai Rong, Zhi-Xi Wu & Wen-Xu Wang, Phys.Rev.E,026101, 2010


3

阿豺折箭戮力一心

阿豺有子二十人。阿豺谓曰：“汝等各奉吾一支箭。”折之地下。俄而命母弟慕利延曰：“汝取一支箭折之。”慕利延折之。又曰：“汝取十九支箭折之。”延不能折。阿豺曰：“汝曹知否？单者易折，众则难摧，戮力一心，然后社稷可固！”

——《魏书•吐谷浑传》


4

Cooperation: the basis of human societies

Robert Boyd and Sarah Mathew, A Narrow Road to Cooperation, SCIENCE,2007


5

Prisoner’s dilemma ( 囚徒困境 ,PD)

Cooperator: help others at a cost to themselves.Defector: receive the benefits without providing help.

Whatever opponent does, player does better by defecting…

C DC (-2,-

2)(-5,-1)

D (-1,-5)

(-3,-3)


6

Some rules for evolutions cooperation

Nowak MA (2006). Five rules for the evolution of cooperation. Science

Kin selection: relativeHamilton, J. Theor. Biol.7 (1964)

Direct reciprocity: unrelated individuals Tit for tat(TFT): nice, punishing, forgiving, but for

noise… Axelrod & Hamilton, Science 211, (1981)

Win stay, lost shift(WSLS) Nowak, Sigmund, Nature 364, (1993)

Indirect reciprocity: reputationNowak, Sigmund, Nature 437 (2005).

Network reciprocity


7

Spatial Game Theory M. Nowak and R. May, Evolutionary games and spatial chaos,Nature

1992

Each player x occupying a site on a network

playing game with neighbors and obtaining payoff: Px(t)

updating rule( replicator dynamics): select a neighbor and learn its behavior with probability ~ f(Py(t)-Px(t))

player2

player1

C D

C 1 0

D 0

:1 2

: the temptation to defection

b

PD b

b


8

Evolutionary games on graphs G. Szabo&G. Fath, Evolutionary games on graphs, Phys. Rep. 446, 2007

Cooperator frequency fc

Game Rule

Selection ruleBest take overRandomPreferential …

PD,SG,SH,UG,PGG, Rock-paper-scissors…

Evolutionary Rule Structure & property

Replacement rulereplicator dynamics W(xy) =f(Py-Px)Fermi dynamics: W(xy)=(1+exp(x-y/κ))-1

Win stay, lost shiftMemory …

Lattice, random graph, small-world, scale-free…<k>, γ, rk , CC, community


9

Diversity of lifetime (time scale)C.Roca, J.Cuesta, A.Sánchez (2006),Physical review letters, vol.97, pp.158701.

Z.X.Wu, Z.H.Rong, P.Holme (2009), Physical Review E, vol.80, pp.36106.

The interaction time scale — how frequently the individuals interact with each other

The selection time scale — how frequently they modifies their strategies

The selection time scale is slower than the interaction time scale, the player has a finite lifetime.

Individuals local on a square lattice.The fitness of i at t-th generation: fi(t)=afi(t-1)+(1-a)gi ,

where -- gi is the payoff of i -- a characterizes the maternal effects.With probability pi, an individual i is selected to update

its strategy:

where κ characterizes the rationality of individuals, and is set as 0.01.

1/pi is the lifetime of i’s current strategy, f(0)=1.

1( )

1 exp[( ) / ]i ji j

W s sf f


10

Some key quantities to characterize the cooperative behaviors

Frequency of cooperators: fc

The extinction threshold of defectors/cooperators:bc1 and bc2

player2

player1

C D

C 1 0

D 0

:1 2

b

PD b

AllD

AllC

C & Dcoexist

DHUDHU Donghua UniversityDonghua UniversityMonomorphic time scale

a↗fc ↗ Optimal fc occurs at p=0.1 for a=0.9p1, C is frequently exploited by D.

P0, Ds around the boundary have enough time to obtain a fitness high enough to beat Cs.

Coherence resonance M. Perc, New J. Phys. 2006,M. Perc & M. Marhl,New J. Phys. 2006 J. Ren, W.-X. Wang, & F. Qi, Phys. Rev. E 75,2007

11

DHUDHU Donghua UniversityDonghua UniversityPolymorphic time scaleThe leaders are the individual with low p the followers are the individual with high p.v% of individuals’ p are 0.1, and others’ p are 0.9.

v=0.5, a=0.9, b=1.1, fc ≈0.712


13

Coevolving time scaleZ.H.Rong, Z.X. Wu, W.X.Wang, Emergence of cooperation through coevolving time

scale in spatial prisoner's dilemma, submitted to Physical Review E , 82, 026101 , 2010

“win-slower, lose-faster” rule: i updates its strategy by comparing with neighbor j with a

different strategy with probability

If i successfully resists the invasion of j, the winner i is rewarded by owing longer lifetime: pi=pi-β, where β is reward factor

If i accepts j's strategy, the loser i has to shorten its lifetime: pi=pi+α, where α is punishment factor

0.1 ≤ pi≤1.0, initially pi=1.0, κ=0.01

What kind of social norm parameters (α,β) can promote the mergence of cooperation?

1( )

1 exp[( ) / ]i ji j

W s sf f


14

a

High time scale C(p>0.5) High time scale D(p>0.5)Low time scale C (p≤0.5) Low time scale D(p ≤0.5)

(α, β)=(0.0,0.1) (α, β)=(0.2,0.1)

(α, β)=(0.9,0.1)Long-term C cluster

(α, β)=(0.9,0.05)short-term C cluster

(α, β)=(0.9,0.9)Long-term D cluster

The extinction threshold of cooperators, rD


15

α=0, increasing β(reward)

Initially p=1, pmin=0.1

High time scale C High time scale D

Low time scale C Low time scale D

t=100 t=50000


16

a

High time scale C High time scale DLow time scale C Low time scale D

(α, β)=(0.0,0.1) (α, β)=(0.2,0.1)

(α, β)=(0.9,0.1)


17

β =0.1, increasing α(punishment)

(α,β)=(0.1,0.1)

(α,β)=(0.9,0.1)

α↗, fc↗Feedback mechanism for

C/D:Winner Cfc↗fintess↗

Winner Dfc↘fintess↘α↗, their losing D neighbors

have greater chance to becoming C, hence cooperation is promoted.

b=1.05


18

a

High time scale C High time scale DLow time scale C Low time scale D

(α, β)=(0.0,0.1) (α, β)=(0.2,0.1)

(α, β)=(0.9,0.1)

(α, β)=(0.9,0.05)

(α, β)=(0.9,0.9)


19

(α,β)=(0.9,0.1)

α =0.9, increasing β(reward) (α,β)=(0.9,0.9)

(α,β)=(0.9,0.05)


Coevolution of Teaching activity

A. Szolnoki and M. Perc, New J. Phys. 10 (2008) 043036A. Szolnoki,et al.,Phys.Rev.E 80(2009) 021901

20

The player x will adopt the randomly selected neighbor y’s strategy with:

wx characterizes the strength of influence (teaching activity) of x. The leader with wx 1.

Each successful strategy adoption process is accompanied by an increase in the donor’s teaching activity:

If y succeeds in enforcing its strategy on x, wywy+Δw.A highly inhomogeneous distribution of influence may emerge.

1( )

1 exp[( ) / ]x y yx y

W s s wP P


Multiplicative “win-slower, lose-faster”

“win-slower, lose-faster” rule: i updates its strategy by comparing with neighbor j

with a different strategy:If i successfully resists the invasion of j, the winner i is rewarded by owing longer lifetime: pi=max(pi/β, pmin)

If i accepts j's strategy, the loser i has to shorten its lifetime: pi=min(pi*α,pmax)

pmin=0.1 and pmax=1.0

21

The extinction threshold of cooperators, rD


22

The extinction threshold of cooperationFor loser:α↗

For winner: βmidThe additive-increase /multiplicative-decrease (AIMD) algorithm in the TCP congestion control on the Internet

Jacobson, Proc. ACM SIGCOMM' 88 The extinction threshold of cooperators, rD


Conclusions

The selection time scale is slower than the interaction time scale.

Both the fixed and the coevolving time scale.

“win-slower, lose-faster” rule

The potential application in the design of consensus protocol in multi-agent systems.

23


24

东华大学http://cist.dhu.edu.cn/index.asp

东华大学位于上海松江区，原名中国纺织大学，是国家教育部所属的 211全国重点大学，也是我国首批具有博士、硕士、学士三级学位授予权的大学之一。

信息学院现有“控制理论与控制工程 (90)”和“模式识别与智能系统 (02)”2 个博士点以及 7 个硕士点，“控制科学与工程 (03)”一级学科博士后流动站，拥有“教育部数字化纺织服装技术工程研究中心”。

信息学院现有教职工近 120人，其中校特聘教授 2 人，长江特聘讲座教授 1 人，博士生导师 16人，具有正高级职称 25人，副高级职称 41人。


25

THANKS!Discussing

Rong Zhihai ( 荣智海 ) ： [email protected]

Department of Automation, DHU

emergence of cooperation through coevolving time scale in spatial prisoner’s dilemma

Documents