Page 1: Slide 1 of 24 Bayesian Games Matthew H. Henry November 10, 2004 References 1.Axlerod, Robert. 1987. “The evolution of strategies in iterated prisoner’s

Bayesian Games

Matthew H. Henry

November 10, 2004

References

1. Axelrod, Robert. 1987. “The evolution of strategies in iterated prisoner’s dilemma.” Genetic Algorithms and Simulated Annealing. (ed. D. Davis) London: Pitman, pp. 32-43.

2. Gibbons, Robert. 1992. Game Theory for Applied Economists. Princeton, New Jersey: Princeton University Press.

3. Harsanyi, John C. 1967. “Games with Incomplete Information Played by Bayesian Players, Parts I, II and III.” Management Science 14:159-182, 320-334, 486-502.

4. Sigmund, Karl. 1993. Games of Life – Explorations in Ecology, Evolution, and Behaviour. Oxford, England: Oxford University Press.

Outline

• Static Games with Bayesian Players

– Example: Scalping Tickets

– Nash Equilibria for Matrix Games with Incomplete Information: Generals

– Nash Equilibria for Games with Asymmetric Information: Cournot Model

– Nash Equilibria for Games with Continuous Type Space: Auction

• Dynamic Games with Bayesian Players

– Perfect Bayesian Equilibrium for Games with Incomplete or Imperfect Information

– Example: 3-Player Game Tree

• Signaling Games

– Perfect Bayesian Equilibrium for Signaling Games

– Example: Job Market Signaling

Static Games with Incomplete Information

• Static games

– Players move simultaneously

– No observation of opponent move history

• Games with incomplete information

– One or more players lack full information regarding the payoff functions and strategies available

– We shall limit the information deficit to the player state (or type) knowledge

• Player type implies (and is implied by) payoff function

• Matrix games will have a unique payoff matrix for each player type match-up

Example: Scalping Tickets

• Consider a scenario in which you and the Cavalier are each scalping tickets for beer money before the UVa-Miami football game

• In every discrete round of the game, each player assumes one of two types and can take one of two actions (stand in one of two locations)

– Types: Buyer or Seller

– Locations: in front of Durty Nellie’s Pub or at the Fry’s Spring Garage

• You know whether you are buying or selling, and you believe that the Cavalier is buying with probability p and selling otherwise

• There are four payoff matrices, one for each of the four possible type match-ups

• Choose a spot (Durty Nellie’s or Fry’s Spring Garage) to maximize profit based on your type and your best guess of the Cav’s type

A Better Example from Harsanyi

• Consider two Generals A and B

– A seeks to maximize (maxmin) payoff and B seeks to minimize (minmax) payoff

– Fixed action profiles: (a1, a2) and (b1, b2)

– Each leads an army which assumes one of two states: Strong or Weak

• This yields four possible match-ups, (AS, BS), (AS, BW), (AW, BS), (AW, BW), with corresponding payoff matrices, each having its own Nash equilibrium:

(AS, BS)     b1     b2
a1            2      5
a2           -1     20

(AS, BW)     b1     b2
a1          -24    -36
a2            0     24

(AW, BS)     b1     b2
a1           28     15
a2           40      4

(AW, BW)     b1     b2
a1           12     20
a2            2     13

[Harsanyi]

Bayesian Players

• Each player knows his own state and estimates his opponent’s state

• Each player has a pure strategy for every possible match-up

• Each player forms a strategy based on the expected payoff

• To continue the example given by Harsanyi, consider the following probabilities of occurrence for the four possible match-ups:

         BS      BW
AS      4/10    1/10
AW      2/10    3/10
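The statement that each player "estimates his opponent’s state" can be made concrete by conditioning the joint probabilities on a player's own state. The slides do not give these conditional beliefs; the sketch below simply applies Bayes' rule to the table above, and the function names are illustrative.

```python
from fractions import Fraction

# Joint probabilities of the four match-ups, copied from the table above.
joint = {
    ("AS", "BS"): Fraction(4, 10),
    ("AS", "BW"): Fraction(1, 10),
    ("AW", "BS"): Fraction(2, 10),
    ("AW", "BW"): Fraction(3, 10),
}

def belief_of_a(a_state):
    """Player A's belief about B's state, conditional on A's own state."""
    marginal = sum(p for (a, _), p in joint.items() if a == a_state)
    return {b: p / marginal for (a, b), p in joint.items() if a == a_state}

def belief_of_b(b_state):
    """Player B's belief about A's state, conditional on B's own state."""
    marginal = sum(p for (_, b), p in joint.items() if b == b_state)
    return {a: p / marginal for (a, b), p in joint.items() if b == b_state}

for state in ("AS", "AW"):
    print(state, belief_of_a(state))
for state in ("BS", "BW"):
    print(state, belief_of_b(state))
```

Because the states are correlated in this table, a Strong A and a Weak A hold different beliefs about B.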

Bayesian Nash Equilibrium

This yields the following expected payoff matrix and a single pure strategy Nash equilibrium:

                 BS b1, BW b1   BS b1, BW b2   BS b2, BW b1   BS b2, BW b2
AS a1, AW a1         7.6            8.8            6.2            7.4
AS a1, AW a2         7.0            9.1            1.0            3.1
AS a2, AW a1         8.8           13.6           14.6           19.4
AS a2, AW a2         8.2           13.9            9.4           15.1

Example calculation: Bayesian Nash Equilibrium payoff = (.4)(-1) + (.1)(0) + (.2)(28) + (.3)(12) = 8.8
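The example calculation generalizes to every cell of the expected payoff matrix. The following is a minimal sketch, assuming the four payoff matrices and match-up probabilities read from the earlier slides; it weights each match-up by its probability and then searches for a saddle point (an entry that is the minimum of its row and the maximum of its column, since A maximizes and B minimizes). Variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Payoff matrices indexed [A's action][B's action]; 0 means a1/b1, 1 means a2/b2.
payoff = {
    ("S", "S"): [[2, 5], [-1, 20]],
    ("S", "W"): [[-24, -36], [0, 24]],
    ("W", "S"): [[28, 15], [40, 4]],
    ("W", "W"): [[12, 20], [2, 13]],
}
prob = {
    ("S", "S"): Fraction(4, 10), ("S", "W"): Fraction(1, 10),
    ("W", "S"): Fraction(2, 10), ("W", "W"): Fraction(3, 10),
}

# A strategy is a pair (action if Strong, action if Weak).
strategies = list(product((0, 1), repeat=2))

def expected_payoff(a_strat, b_strat):
    """Probability-weighted payoff over the four match-ups."""
    total = Fraction(0)
    for (a_state, b_state), p in prob.items():
        a_act = a_strat[0] if a_state == "S" else a_strat[1]
        b_act = b_strat[0] if b_state == "S" else b_strat[1]
        total += p * payoff[(a_state, b_state)][a_act][b_act]
    return total

table = [[expected_payoff(a, b) for b in strategies] for a in strategies]

# Saddle points: minimum of the row (B's best reply) and maximum of the column (A's best reply).
saddles = [
    (strategies[i], strategies[j], table[i][j])
    for i in range(4) for j in range(4)
    if table[i][j] == min(table[i]) and table[i][j] == max(row[j] for row in table)
]

for row in table:
    print([float(x) for x in row])
print(saddles)
```

Exact fractions are used so that the equality tests in the saddle-point search are not affected by floating-point rounding.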

Interpretation of Bayesian Nash Equilibrium

• Player A takes action a2 if Strong and a1 if Weak.

• Player B takes action b1 irrespective of state.

• This equilibrium emerges from the known probabilities of each possible match-up

• Nash optimal: each player’s strategy is a best response (in a Bayesian sense) to the actions available to his opponent

• Note that each player has a pure state-dependent strategy (however, an outside observer could interpret it as a mixed strategy, with Nature playing the part of a third, indifferent player who randomly chooses states for players A and B according to fixed probability distributions)

Static Bayesian Game #2: Cournot Model

• Consider a Cournot model comprising two firms A and B producing the same commodity to satisfy market demand, D.

• The commodity price on the market is given by

$$P = \begin{cases} D - q_A - q_B & \text{if } q_A + q_B < D \\ 0 & \text{otherwise} \end{cases}$$

• Firm A’s cost of producing the commodity is cAqA

– cA is the marginal cost

– qA is the quantity that Firm A produces.

• Firm B’s cost of producing the commodity is

– cB1qB with probability p

– cB2qB with probability (1-p).

• Player state is defined by its marginal cost

• Each firm seeks to maximize its profit by anticipating the market price

Cournot Model and Asymmetric Information

• Firm B knows its state and Firm A’s state

• Firm A knows its own marginal cost but can only estimate Firm B’s state

• Each firm knows of the other’s degree of knowledge

• Gibbons calls this a Bayesian game with asymmetric information

• Firm A chooses the optimal quantity qA to produce:

$$\max_{q_A} \; p\,[D - q_A - q_B(c_{B1}) - c_A]\,q_A + (1-p)\,[D - q_A - q_B(c_{B2}) - c_A]\,q_A$$

• Firm B chooses the optimal quantity qB to produce:

For cB1:

$$\max_{q_{B1}} \; [D - q_A - q_{B1} - c_{B1}]\,q_{B1}, \quad \text{where } q_{B1} = q_B(c_{B1})$$

For cB2:

$$\max_{q_{B2}} \; [D - q_A - q_{B2} - c_{B2}]\,q_{B2}, \quad \text{where } q_{B2} = q_B(c_{B2})$$

Analytical Solution: Bayesian Nash Equilibrium

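System of Equations:

$$q_A^* = \frac{p\,[D - q_B^*(c_{B1}) - c_A] + (1-p)\,[D - q_B^*(c_{B2}) - c_A]}{2}$$

$$q_B^*(c_{B1}) = \frac{D - q_A^* - c_{B1}}{2}, \quad \text{Firm B's solution if it is in state 1}$$

$$q_B^*(c_{B2}) = \frac{D - q_A^* - c_{B2}}{2}, \quad \text{Firm B's solution if it is in state 2}$$

Solutions:

$$q_A^* = \frac{D - 2c_A + p\,c_{B1} + (1-p)\,c_{B2}}{3}$$

$$q_B^*(c_{B1}) = \frac{D - 2c_{B1} + c_A}{3} + \frac{1-p}{6}\,(c_{B1} - c_{B2})$$

$$q_B^*(c_{B2}) = \frac{D - 2c_{B2} + c_A}{3} - \frac{p}{6}\,(c_{B1} - c_{B2})$$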
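As a sanity check, the closed-form quantities can be substituted back into the three best-response equations. The sketch below does this with exact fractions; the parameter values (D = 10, cA = 2, cB1 = 3, cB2 = 1, p = 1/2) are hypothetical and chosen only for illustration, and the function name is not from the slides.

```python
from fractions import Fraction as F

def cournot_bne(D, c_A, c_B1, c_B2, p):
    """Closed-form Bayesian Nash equilibrium quantities (q_A, q_B1, q_B2)."""
    q_A = (D - 2 * c_A + p * c_B1 + (1 - p) * c_B2) / 3
    q_B1 = (D - 2 * c_B1 + c_A) / 3 + (1 - p) * (c_B1 - c_B2) / 6
    q_B2 = (D - 2 * c_B2 + c_A) / 3 - p * (c_B1 - c_B2) / 6
    return q_A, q_B1, q_B2

# Hypothetical parameters, not taken from the slides.
D, c_A, c_B1, c_B2, p = F(10), F(2), F(3), F(1), F(1, 2)
q_A, q_B1, q_B2 = cournot_bne(D, c_A, c_B1, c_B2, p)

# Each quantity should equal its own best response to the others.
br_A = (p * (D - q_B1 - c_A) + (1 - p) * (D - q_B2 - c_A)) / 2
br_B1 = (D - q_A - c_B1) / 2
br_B2 = (D - q_A - c_B2) / 2

print(q_A, q_B1, q_B2)
print(q_A == br_A, q_B1 == br_B1, q_B2 == br_B2)
```

With these numbers total output stays below D in both of Firm B's states, so the price is positive and the interior first-order conditions apply.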

Bayesian Game with Continuous Type Space: Auction

• Consider an auction comprising two bidders and one item

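• Players offer bids, b1 or b2, for the item

• b1 & b2 ∈ [0, 1]

• Each bidder values the item at v1 or v2 with payoff v1 – p or v2 – p, respectively, where p is the price the winner pays (his own bid, per the utility function below)

• v1 & v2 ∈ [0, 1]

• Each bidder i solves:

$$\max_{b_i} \; (v_i - b_i)\,P(b_i > b_j) + \tfrac{1}{2}\,(v_i - b_i)\,P(b_i = b_j)$$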

Note: The latter term in this utility function applies only when bids are offered in fixed increments. For bids from the continuous set [0,1], this term is zero.

Linear Equilibrium

• We simplify the search for equilibrium by limiting the solution to the linear form bi(vi) = ai + civi

• This does not limit the player action spaces to linear strategies, but simply looks for a linear equilibrium solution

• We can assume that a player i will bid neither above the highest expected bid nor below the lowest expected bid of player j

• Therefore, aj ≤ bi ≤ aj + cj, since vj ∈ [0, 1] and is a uniformly distributed random variable

Linear Equilibrium

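• This gives us:

$$P(b_i > b_j) = P(b_i > a_j + c_j v_j) = P\!\left(v_j < \frac{b_i - a_j}{c_j}\right) = \frac{b_i - a_j}{c_j}$$

and

$$\max_{b_i} \; (v_i - b_i)\,\frac{b_i - a_j}{c_j} \;\Rightarrow\; b_i = \frac{v_i + a_j}{2}$$

or

$$b_i(v_i) = \begin{cases} \dfrac{v_i + a_j}{2} & \text{for } v_i \ge a_j \\ a_j & \text{otherwise} \end{cases}$$

similarly

$$b_j(v_j) = \begin{cases} \dfrac{v_j + a_i}{2} & \text{for } v_j \ge a_i \\ a_i & \text{otherwise} \end{cases}$$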

Linear Equilibrium

• Since we are looking for a linear solution, ai and aj ≤ 0, since values greater than zero would yield a non-linear (piecewise) solution or, if greater than 1, would yield an infeasible solution since neither bidder will offer more than he values the item.

• With aj ≤ 0, the best response bi = (vi + aj)/2 is linear with ci = 1/2 and ai = aj/2; by symmetry, aj = ai/2. Thus, since the bids must be non-negative, ai = aj = 0, and the solution is that each bidder will offer one half his valuation of the item.
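The half-valuation result can be checked numerically. The sketch below assumes the opponent follows the equilibrium strategy (bidding half of a valuation drawn uniformly from [0, 1]) and searches a grid of bids for the one that maximizes the expected payoff; the grid resolution and function names are illustrative choices, not part of the slides.

```python
def expected_payoff(v, b):
    """Expected payoff to a bidder with valuation v who bids b, assuming the
    opponent bids v_j / 2 with v_j uniform on [0, 1]."""
    win_prob = min(2.0 * b, 1.0)  # P(v_j / 2 < b)
    return (v - b) * win_prob

def best_bid(v, steps=1000):
    """Grid search over bids in [0, 1] for the payoff-maximizing bid."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda b: expected_payoff(v, b))

for v in (0.2, 0.5, 0.8):
    print(v, best_bid(v))
```

The tie term of the utility function is omitted here because, as noted earlier, it vanishes for bids drawn from the continuous set [0, 1].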

Dynamic Games with Bayesian Players

• Dynamic games with incomplete or imperfect information

– Players move after observing the actions taken by their opponents.

– Recall from the initial discussion on static games that information incompleteness implied an information deficit with respect to an opponent’s type or state

– Information imperfection implies that each successive player’s move is based on complete information about the state of the other players but flawed information about the state of the game; i.e., the play history on the part of his opponents

• These games require a new solution concept: perfect Bayesian equilibrium

Perfect Bayesian Equilibrium

Gibbons gives the following four requirements for a perfect Bayesian equilibrium:

1. For each game turn, the moving player must have a belief about the state of the game, i.e. the play history to that point, in the form of a probability distribution over the set of the possible game sub-states at that point.

2. Given their beliefs, the players’ strategies must be sequentially rational. Note: An example of an irrational (but effective under some circumstances) strategy is tit-for-tat in repeated prisoner’s dilemma games. [Axelrod, Sigmund]

3. At each game state on the equilibrium path, beliefs are formed by observation-driven Bayes’ rule and players’ equilibrium strategies. (For a given equilibrium in a sequential game, a game state is on the equilibrium path if it will be reached with positive probability when the game is played according to equilibrium strategies. Otherwise, the state is off the equilibrium path.)

4. For game states off the equilibrium path, beliefs are formed by Bayes’ rule and players’ equilibrium strategies where possible.

Simple Example

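Consider the following 3-player game tree. Each set of nodes corresponding to outcomes associated with any particular player’s move represents a possible game state.

[Figure: game tree. Payoffs are listed as (P1, P2, P3).]

P1 chooses A or D

– A: the game ends with payoffs (2, 0, 0)

– D: P2 chooses L or R; P3 then chooses L’ or R’, believing that P2 played L with probability [p] and R with probability [1-p]

L then L’: (1, 2, 1)

L then R’: (3, 3, 3)

R then L’: (0, 1, 2)

R then R’: (0, 1, 1)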

R1. This requirement is relevant for P3 only since if P1 chooses A, the game is over, and thus P2 has only to believe that he is in state D if he has a turn. Player 3 must conclude that p = 1 since R is dominated by L for player 2.

R2. Given this belief, Player 3 must choose R’.

R3. This requirement is satisfied by R1.

R4. This requirement is trivially satisfied since there are no states off the equilibrium path.

Thus, the equilibrium (D, L, R’) can be confirmed by inspection.
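The inspection argument can be retraced mechanically. The sketch below is one reading of the tree, not a definitive transcription: the assignment of the five payoff triples to branches is an assumption recovered from the figure (A gives (2, 0, 0); after D the four outcomes are as coded), and the function names are illustrative.

```python
# Payoffs are (P1, P2, P3). Branch assignments are an assumed reading of the figure.
PAYOFF_A = (2, 0, 0)
PAYOFF_D = {
    ("L", "L'"): (1, 2, 1),
    ("L", "R'"): (3, 3, 3),
    ("R", "L'"): (0, 1, 2),
    ("R", "R'"): (0, 1, 1),
}

def l_strictly_dominates_r():
    """True if L gives P2 more than R against every action of P3."""
    return all(
        PAYOFF_D[("L", a3)][1] > PAYOFF_D[("R", a3)][1] for a3 in ("L'", "R'")
    )

def p3_best_response(p):
    """P3's expected-payoff-maximizing action given belief p that P2 played L."""
    def expected(action):
        return p * PAYOFF_D[("L", action)][2] + (1 - p) * PAYOFF_D[("R", action)][2]
    return max(("L'", "R'"), key=expected)

def solve():
    if not l_strictly_dominates_r():
        raise ValueError("dominance argument does not apply to these payoffs")
    p2_move = "L"
    p = 1.0  # R1: P3 must believe P2 played the dominant action
    p3_move = p3_best_response(p)  # R2: sequential rationality for P3
    # P1 compares the payoff from ending the game with the continuation payoff.
    p1_move = "D" if PAYOFF_D[(p2_move, p3_move)][0] > PAYOFF_A[0] else "A"
    return p1_move, p2_move, p3_move

print(solve())
```

Under this reading, P3 would switch to L’ only for a sufficiently low belief p, which is why the belief requirement matters.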

Signaling Games

• Games of two players with incomplete information about the opponent’s type

• One player is the Sender, one is the Receiver.

• Nature draws a type for the Sender according to a probability distribution on the set of feasible types.

• The Sender observes his type and sends a message based on that type. The sender can follow pooling, separating or hybrid strategies.

– A pooling Sender transmits the same message regardless of type.

– A separating Sender always transmits different messages for each type.

• The Receiver observes the message but not the type and chooses an action.

• Payoffs to the Sender and Receiver are each a function of Sender type, message and Receiver action.

Requirements for Perfect Bayesian Equilibrium in Signaling Games

1. After observing the Sender’s message, the Receiver must have a belief about the Sender’s type in the form of a probability distribution conditional upon the message transmitted.

2R. For each message observed, the Receiver’s action must maximize the Receiver’s expected payoff, given the belief about the Sender’s type.

2S. For each type determined by Nature, the Sender’s message must maximize his expected payoff, given the Receiver’s strategy, defined as the set of actions to be taken as functions of the message transmitted.

3. For each message transmittable by the Sender, if there exists a sender type such that the message is optimal for that type, then the Receiver’s belief about the Sender’s type must be derivable from Bayes’ rule and the Sender’s strategy.
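Requirement 3 can be illustrated with a small Bayes' rule update. The sketch below is a minimal example, assuming a hypothetical prior over two Sender types and two hypothetical messages "m1" and "m2"; none of these values or names come from the slides.

```python
from fractions import Fraction

def receiver_belief(prior, sender_strategy, message):
    """Posterior over Sender types after observing `message`.

    prior: dict mapping type to prior probability.
    sender_strategy: dict mapping type to {message: probability of sending it}.
    Returns None when no type sends the message, since Bayes' rule is then undefined.
    """
    joint = {t: prior[t] * sender_strategy[t].get(message, 0) for t in prior}
    total = sum(joint.values())
    if total == 0:
        return None
    return {t: weight / total for t, weight in joint.items()}

# Hypothetical prior and strategies.
prior = {"H": Fraction(1, 3), "L": Fraction(2, 3)}
pooling = {"H": {"m1": 1}, "L": {"m1": 1}}
separating = {"H": {"m1": 1}, "L": {"m2": 1}}

print(receiver_belief(prior, pooling, "m1"))     # pooling message carries no information
print(receiver_belief(prior, separating, "m1"))  # separating message reveals the type
print(receiver_belief(prior, pooling, "m2"))     # message never sent under pooling
```

Under a pooling strategy the posterior equals the prior, while under a separating strategy the message identifies the type with certainty.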

Example: Job Market Signaling

• Nature determines a worker’s (the Sender) productive ability, which can be either High or Low. The probability that his ability is High is q.

• The worker observes his ability and chooses a level of education (his message to potential employers).

• The hiring market (the Receiver) observes the worker’s level of education and, based on a belief about the worker’s ability, offers a wage (Receiver’s action).

• Payoff to the worker is W – C(a, e), where W is the wage offered, C is the cost (financial + intellectual difficulty) of attaining a particular level of education as a function of ability a and education level e. Presumably, the cost of attaining a higher level of education for a Low ability worker is relatively high due to the additional intellectual difficulty sustained by the worker in its pursuit.

• Payoff to the hiring market is P(a, e) – W, where P is the level of productivity supplied by the worker as a function of ability and education level.

Complete Information Solution

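[Figure: wage W versus education level e, showing productivity lines P(L,e) and P(H,e), indifference curves IL and IH, and the complete-information outcomes (e*(L), W*(L)) and (e*(H), W*(H)).]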

Note the marginal cost of education is higher for a Low ability worker, thus he would require a higher relative salary to justify pursuing a higher education, hence the steeper indifference curve.

The Productivity lines are found from the Nash solution W(e) = P(a, e) in which the market, which is presumed to be competitive and therefore devoid of excess profit, offers a wage equal to the expected level of productivity.

Pooling Equilibria and the Power of Envy

• Suppose now that the hiring market has incomplete information about the worker’s type and only observes the level of education attained by the workers.

• Suppose further that a Low ability worker is envious of a High ability worker’s salary and decides to attempt to masquerade as a High ability worker by getting a more advanced degree.

• This constitutes a pooling strategy since the worker will attempt to signal to the hiring market that he is of High ability irrespective of type.

Note, this is only rational if the following inequality holds:

W*(H) - C[L,e*(H)] > W*(L) – C[L,e*(L)]
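The inequality is easy to evaluate once wages and costs are known. The sketch below uses hypothetical numbers throughout (the slides give no values), and the function name is illustrative.

```python
def wants_to_masquerade(w_high, w_low, cost_low_at_e_high, cost_low_at_e_low):
    """True if a Low ability worker strictly prefers the High ability package,
    i.e. W*(H) - C[L, e*(H)] > W*(L) - C[L, e*(L)]."""
    return w_high - cost_low_at_e_high > w_low - cost_low_at_e_low

# Hypothetical values: a cheap degree makes masquerading attractive.
cheap_degree = wants_to_masquerade(w_high=10, w_low=6,
                                   cost_low_at_e_high=3, cost_low_at_e_low=1)
# Hypothetical values: a costly degree removes the incentive.
costly_degree = wants_to_masquerade(w_high=10, w_low=6,
                                    cost_low_at_e_high=6, cost_low_at_e_low=1)
print(cheap_degree, costly_degree)
```

The comparison shows that envy only matters when the extra education is cheap enough for the Low ability worker relative to the wage gap.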

Masquerading Workers with Pooling Strategies

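[Figure: wage W versus education level e, showing productivity lines P(L,e) and P(H,e), the expected productivity line qP(H,e) + (1-q)P(L,e), indifference curves IL and IH, and the points (e*(L), W*(L)) and (e*p, W*p).]

Here the Nash equilibrium sets the wage at W*p, where the expected Productivity line intersects both indifference curves.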