DNA Starts to Learn Poker
David Harlan Wood (4)*
Hong Bi (1)
Steven O. Kimbrough (2)
Dongjun Wu (3)
Junghuei Chen (1)*
Departments of (1) Chemistry & Biochemistry and (4) Computer & Information Sciences, University of Delaware
(2) The Wharton School, University of Pennsylvania
(3) Bennett S. LeBow College of Business, Drexel University
[Figure: game tree, built up over three slides (Player dealt an Ace; Player dealt a 2; both together).
Deal: the player is dealt an Ace or a 2.
Player dealt an Ace: Say Ace (adds $1).
Player dealt a 2: Say Ace (adds $1) or Say 2.
Dealer, after the player says Ace: Call (adds $1) or Fold.
Leaf labels: Loses $1, Loses $2, Wins $2.]
OBJECTIVE: To Obtain Probabilistic Strategies
Each player wants to obtain a strategy for the game.
A strategy prescribes an action in every possible situation.
That is, at each node of the game tree, the strategy gives the decision to raise as a function of the hand dealt.
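One way to picture such a strategy in software is a table of probabilities, one entry per decision point. The sketch below is only an illustration and not the DNA encoding: the names `player_strategy`, `dealer_strategy`, `choose`, and `play_once`, and all numeric probabilities, are assumptions made up for the example.

```python
import random

# Hypothetical probabilistic strategies for the two-card game.
# player_strategy[hand] = probability of saying "Ace" given the hand dealt.
# dealer_strategy = probability of calling after the player says "Ace".
# The numbers are illustrative assumptions, not values from the slides.
player_strategy = {"A": 1.0, "2": 0.25}
dealer_strategy = 0.5


def choose(probability, yes, no, rng):
    """Return `yes` with the given probability, otherwise `no`."""
    return yes if rng.random() < probability else no


def play_once(hand, rng):
    """Sample one course of play: the player's statement, then the dealer's reply."""
    said = choose(player_strategy[hand], "Say Ace", "Say 2", rng)
    if said == "Say 2":
        return [said]  # the dealer has no decision to make
    reply = choose(dealer_strategy, "Call", "Fold", rng)
    return [said, reply]


rng = random.Random(0)
print(play_once("A", rng))
print(play_once("2", rng))
```

Each call samples one path through the tree, so repeated calls with the same hand can give different courses of play.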
Learning Poker
[Figure: the learning cycle, with these boxes:]
Play New Game
Recover & Cut Play Histories for Player’s & Dealer’s Strategies
Separate by Payoffs
Recover & Distribute Strategies (Player’s Strategies, Dealer’s Strategies)
Programmable Selection of Recovered Dealer Strategies
Programmable Selection of Recovered Player Strategies
Dealer’s Adaptation: Amplify, Crossover, Mutate
Player’s Adaptation: Amplify, Crossover, Mutate
New Dealer Strategies, New Player Strategies, Deals
Assemble
[Figure: DNA encodings of strategies and deals, shown beside the game tree.
Dealer’s Strategies (sites R.E. 1 and R.E. 2, with Stopper segments): Say A’ FOLD’ Call’ Fold’
Player’s Strategies (site R.E. 1, with Stopper segments): 2’ Say 2’ Fold’ Error SAY 2’ Say A’ A’ Say A’
Deals (site R.E. 2): Dealt 2 (2) and Dealt A (A)]
Sequences from: Sakamoto et al., DNA4 (1997)
Two Strategies and a Deal Define a Game
Ace Dealt
A Player’s Strategy (R.E. 1): 2’ Say 2’ Fold’ Error SAY 2’ Say A’ A’ Say A’
A Dealer’s Strategy (R.E. 1, R.E. 2): Say A’ FOLD’ Call’ Fold’
Deal (R.E. 2): A
Cut with R.E. 1 & R.E. 2 and Assemble a Game
[Figure: the cut fragments, then the assembled game in the order Player’s Strategy, Dealer’s Strategy, Deal.]
Player’s Strategy: 2’ Say 2’ Fold’ Error SAY 2’ Say A’ A’ Say A’
Dealer’s Strategy: Say A’ FOLD’ Call’ Fold’
Deal: A
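The idea that two strategies and a deal define a game can be sketched in software as a three-part record. This is a simplified illustration under stated assumptions: each strategy is reduced to a single hypothetical yes/no bit, the player is assumed always to say Ace when dealt an Ace (the only option the tree shows), and the dollar payoffs are read from the tree's leaf labels with the sign convention "from the player's side" assumed. The names `Game`, `assemble`, and `payoff_to_player` are invented for the example.

```python
from typing import NamedTuple


class Game(NamedTuple):
    """One game = a player's strategy, a dealer's strategy, and a deal."""
    player_says_ace_on_2: bool  # hypothetical one-bit player strategy
    dealer_calls: bool          # hypothetical one-bit dealer strategy
    deal: str                   # "A" or "2"


def assemble(player_bit, dealer_bit, deal):
    """Join the three parts in the order Player's Strategy, Dealer's Strategy, Deal."""
    return Game(player_bit, dealer_bit, deal)


def payoff_to_player(game):
    """Payoff in dollars from the player's side (assumed sign convention)."""
    says_ace = game.deal == "A" or game.player_says_ace_on_2
    if not says_ace:
        return -1  # player says 2
    if not game.dealer_calls:
        return +1  # dealer folds
    return +2 if game.deal == "A" else -2  # dealer calls


for deal in ("A", "2"):
    for p in (False, True):
        for d in (False, True):
            g = assemble(p, d, deal)
            print(g, payoff_to_player(g))
```

Once the three parts are fixed, the outcome of this simplified game is fully determined, which is the point of the slide title.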
[Figure: gel of the assembly. Lanes: S1, S2, S3, S4, R1, R2, M.
S1 (74-mer), S2 (57-mer), S3 (48-mer), S4 (53-mer); L1 (25-mer), L2 (28-mer), L3 (28-mer).
R1: Ligation Reaction; R2: Purified Ligation Product.
Size labels: 50, 75, 100, 150, 200, 225, 232.]
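A quick arithmetic check on the labels: the four strand lengths sum to 232, which matches the largest size label on the gel. Reading 232 as the length of the full ligation product is an inference from the labels, not something the slide states outright.

```python
# Strand lengths as labeled on the gel slide, in nucleotides.
strand_lengths = {"S1": 74, "S2": 57, "S3": 48, "S4": 53}

total = sum(strand_lengths.values())
print(total)  # 232
```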
Player Dealt an Ace
[Figure: Ace branch of the game tree, annotated: Player Says A, Dealer Folds, Dealer MIGHT Change to Call.]
Player’s Strategy: 2’ Say 2’ Fold’ Error SAY 2’ Say A’ A’ Say A’
Dealer’s Strategy: Say A’ FOLD’ Call’ Fold’
Deal: A
Player Says Ace (Player’s Strategy): A’ Say A’; from A, Extend(Say A)
Dealer Folds (Dealer’s Strategy): Say A’ Fold’; from Say A, Extend(Fold)
Dealer MIGHT Change to Call (Dealer’s Strategy): Fold’ FOLD’ Call’; from Fold, Extend(Call); other labels: Preventer, Error
[Figure: gel of the extension products. Size labels: 200, 225, 250, 275, 300. Bands: 232-mer, 247-mer, 262-mer, 282-mer.]
Player Dealt a 2
[Figure: 2 branch of the game tree, annotated: Player Says 2, (Block Say 2), Player Changes to Say A, Dealer Folds, Dealer Changes to Call.]
Player’s Strategy: 2’ Say 2’ Fold’ Error SAY 2’ Say A’ A’ Say A’
Dealer’s Strategy: Say A’ FOLD’ Call’ Fold’
Deal: 2
Player Says 2 (Player’s Strategy): Say 2’ 2’; from 2, Extend(Say 2)
Player MIGHT Change to Say Ace (Player’s Strategy): SAY 2’ Say A’; from Say 2, Extend(Say A)
Dealer Folds (Dealer’s Strategy): Say A’ Fold’; from Say A, Extend(Fold)
Dealer MIGHT Change to Call (Dealer’s Strategy): FOLD’ Call’; from Fold, Extend(Call); other labels: Preventer, Fold’ Error
[Figure: full game tree annotated with the course of play.]
Player dealt an Ace: Player Says A; Dealer Folds; Dealer MIGHT Change to Call.
Player dealt a 2: Player Says 2; Player MIGHT Change to Say Ace; Dealer Folds; Dealer MIGHT Change to Call.
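The annotated course of play can be read as a chain of state transitions, where each "MIGHT change" step fires only some of the time. The sketch below simulates that chain in software. It is an analogy under assumptions, not a model of the chemistry: the table `TRANSITIONS`, the function `play_history`, and the probabilities 0.3 and 0.4 are invented for illustration.

```python
import random

# Hypothetical transition table read off the annotated tree:
# current strand end -> (next segment, probability that the extension happens).
# Probabilities below 1.0 stand in for the "MIGHT change" steps; the values
# are illustrative assumptions only.
TRANSITIONS = {
    "A": ("Say A", 1.0),      # Player Says Ace
    "2": ("Say 2", 1.0),      # Player Says 2
    "Say 2": ("Say A", 0.3),  # Player MIGHT Change to Say Ace
    "Say A": ("Fold", 1.0),   # Dealer Folds
    "Fold": ("Call", 0.4),    # Dealer MIGHT Change to Call
}


def play_history(deal, rng):
    """Extend from the dealt card until no further extension occurs."""
    history = [deal]
    while history[-1] in TRANSITIONS:
        nxt, prob = TRANSITIONS[history[-1]]
        if rng.random() >= prob:
            break  # the "change your mind" step did not fire
        history.append(nxt)
    return history


rng = random.Random(0)
for deal in ("A", "2"):
    print(play_history(deal, rng))
```

The returned list plays the role of a play history: its last element tells which leaf of the tree was reached.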
[Figure: the dealer’s half of the learning cycle: Recover & Cut Play Histories for Player’s & Dealer’s Strategies; Separate by Payoffs; Recover & Distribute Strategies; Programmable Selection of Recovered Dealer Strategies; Dealer’s Adaptation (Amplify, Crossover, Mutate).]
Strategies are returned grouped by outcomes: -$2, -$1, +$1, +$2.
Select the Dealer’s own preferred mix of strategies to be bred.
Breed by using PCR to restore population size, using a variable mutation rate.
Crossover by pairwise recombining of “change your mind” regions.
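The four learning steps above resemble one generation of a genetic algorithm, and can be sketched in software as follows. This is an in-silico analogy, not the laboratory protocol: strategies are assumed to be short lists of bits, the "preferred mix" is assumed to be a table of integer weights per payoff, and all function names are invented for the example.

```python
import random
from collections import defaultdict


def group_by_payoff(played):
    """played: list of (strategy, payoff) pairs -> {payoff: [strategies]}."""
    groups = defaultdict(list)
    for strategy, payoff in played:
        groups[payoff].append(strategy)
    return groups


def breed(groups, mix, size, mutation_rate, rng):
    """Select by the preferred mix, then copy (with mutation) back up to `size`.

    mix maps payoff -> integer weight; a larger weight means more copies
    of that payoff group enter the breeding pool.
    """
    pool = []
    for payoff, weight in mix.items():
        pool += groups.get(payoff, []) * weight
    if not pool:
        return []
    offspring = []
    for _ in range(size):
        child = list(rng.choice(pool))
        for i in range(len(child)):
            if rng.random() < mutation_rate:
                child[i] = 1 - child[i]  # flip one bit
        offspring.append(child)
    return offspring


def crossover(a, b, point):
    """Swap the tails of two strategies after `point`."""
    return a[:point] + b[point:], b[:point] + a[point:]


rng = random.Random(0)
groups = group_by_payoff([([0, 1], 2), ([1, 1], -1), ([0, 0], 2)])
print(breed(groups, {2: 3, 1: 1}, size=6, mutation_rate=0.1, rng=rng))
print(crossover([0, 0, 0], [1, 1, 1], 1))
```

Passing `mutation_rate` as an argument mirrors the variable mutation rate mentioned on the slide; the single-point `crossover` is a stand-in for recombining the “change your mind” regions.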
Learning
Complexity
Our complexity is linear in the number of nodes in the tree.
# nodes in tree = 2^(players + betting rounds)
At each node, we need a probability distribution giving “level of bet” as a function of “dealt hand”.
For us, the probability distribution is replaced by probabilistic hybridization of the DNA-encoded “dealt hand” to an adapting “change your mind about folding” region of the strategy.
The output (if generated) is an adapting “level of bet” region of the strategy.
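To see how a population of molecules can stand in for a probability distribution, consider the following software sketch. It assumes each strategy copy independently binds the dealt hand with some fixed probability; the table `BIND_PROBABILITY`, its values, and the function `fraction_that_bet` are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical hybridization probabilities: the chance that a dealt hand
# binds the "change your mind about folding" region. Values are
# illustrative assumptions, not measurements.
BIND_PROBABILITY = {"A": 0.9, "2": 0.2}


def fraction_that_bet(hand, copies, rng):
    """Simulate many strategy copies; return the fraction that produce a bet."""
    bets = sum(rng.random() < BIND_PROBABILITY[hand] for _ in range(copies))
    return bets / copies


rng = random.Random(0)
for hand in ("A", "2"):
    print(hand, fraction_that_bet(hand, 10_000, rng))
```

With many copies, the observed fraction approaches the underlying binding probability, so the population as a whole behaves like the distribution “level of bet” as a function of “dealt hand”.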
[Figure: strategy segment at one node. Labels: hand, bet, next; hand’ (hand evaluator), bet’ (bet generator), next’; Extend.]
Comparison
Koller and Pfeffer derive equilibrium mixed strategies with complexity polynomial in
# nodes * # possible deals * 2 betting levels
“Representations and Solutions for Game-Theoretic Problems,” Artificial Intelligence (1997)
• Two-player games only
• Don’t exploit weakness of opponent
• No dynamics, only equilibrium
3-Player Poker: All Possible Deals
[Figure: grid of the possible deals to Player 1, Player 2, and Player 3.]
Course of Play
[Figure: betting tree over P1, P2, and P3, with moves Pass, Bet $a, F, and C.]
C: Call (add $b); F: Fold
Learning Poker
[Figure: the learning cycle again, with the recovery step labeled “Recover Dealer’s & Player’s Strategies”.]