
Learning to Make Better Strategic Decisions*

Eric Cardella†

September 25, 2011

Abstract

Strategic settings are often complex, and agents who lack deep reasoning ability may initially fail to make optimal decisions. This paper experimentally investigates how the decision making quality of an agent's opponent impacts learning-by-doing (LBD) and learning-by-observing (LBO) in a strategic decision making environment. Specifically, does LBD become more effective when agents face an opponent who exhibits optimal decision making? Similarly, does LBO become more effective when agents observe an opponent who exhibits optimal decision making? I consider an experimental design that enables me to measure strategic decision making quality and to control the decision making quality of an agent's opponent. The results suggest that LBD is more effective when facing an optimal decision making opponent, whereas LBO does not appear to be more effective when observing an optimal decision making opponent. The results also suggest that LBD is, at most, marginally more effective than LBO at improving decision making quality in the game considered.

*I would like to thank my dissertation advisor Martin Dufwenberg for helpful comments. I would also like to thank Anna Breman, David Butler, Tamar Kugler, John Wooders, and conference participants at the University of Arizona for additional helpful comments. I am grateful to the Economic Science Laboratory at the University of Arizona for providing financial support.

†Department of Economics, University of Arizona, McClelland Hall 401, PO Box 210108, Tucson, AZ 85721-0108; Email: [email protected].


1 Introduction

Economic settings are often complex, and optimal decision making can require deep reasoning ability. Oligopolies, negotiations, contracting, and auctions represent a few of the many complex economic settings where agents are called upon to make important strategic decisions. Agents who lack high levels of strategic sophistication and/or deep reasoning ability are likely to initially make sub-optimal decisions, which can often lead to inefficient outcomes. As a result, investigating how agents learn to make better decisions remains an important and largely open research question. The motivation of this paper is to investigate learning in strategic settings and provide insights regarding how agents can possibly become better strategic decision makers.

In relation to single agent decision tasks, one learning mechanism that can facilitate improved decision making is learning-by-doing (LBD). By repeatedly doing a decision task, agents acquire knowledge and skill that can subsequently lead to better decision making. In a seminal paper, Arrow (1962) argues that "learning is the product of experience. Learning can only take place through the attempt to solve a problem and therefore only takes place during activity" (p. 155). I refer the reader to Thompson (2010) for a comprehensive review of the extensive literature on LBD, including theoretical applications and empirical investigations supporting LBD. An alternative, yet related, learning mechanism that can facilitate improved decision making in single agent decision tasks is learning-by-observing (LBO). By repeatedly observing the decision making of another, agents acquire knowledge and skill that can subsequently lead to better decision making. For example, Jovanovic and Nyarko (1995) present a model of LBO where an "apprentice" learns from the skillful "foreman" he observes. Merlo and Schotter (2003) and Nadler et al. (2003) provide experimental evidence that supports LBO.1

In relation to strategic games, LBD would correspond to the acquisition of knowledge and skill by repeatedly playing the game. I refer to this equivalent of LBD in a game as strategic LBD. Similarly, LBO in a strategic game would correspond to the acquisition of knowledge and skill by repeatedly observing another agent play the game. I refer to this equivalent of LBO in a game as strategic LBO. Games, unlike single agent decision tasks, involve the decision making of multiple agents. Therefore, it is possible that the effectiveness of strategic LBD and strategic LBO will be influenced by the decision making of the other agents in the game. The first motivation of this study is to experimentally investigate how strategic LBD and strategic LBO are influenced by the decision making quality of an agent's opponent. Specifically, I investigate whether strategic LBD becomes more effective when an agent repeatedly plays against an opponent who makes optimal decisions, compared to sub-optimal decisions. Similarly, I investigate whether strategic LBO becomes more effective when an agent repeatedly observes another agent who plays an opponent who makes optimal decisions, compared to sub-optimal decisions.

1 LBO has also been well documented in several animal experiments, including John et al. (1969), Tomasello et al. (1987), and Terkel (1996).

To shed light on these questions, I propose a stylized experimental design, which uses a 2-player, sequential-move game that features a dominant strategy. A full description of the experimental design follows in the subsequent section. The dominant strategy of the chosen game serves as an identifiable proxy for optimal strategic decision making. The design also features the implementation of pre-programmed computer opponents, which enables me to explicitly control the decision making quality of each subject's opponent.2 In particular, I consider two types of computer opponents: the first, which I refer to as the optimizing opponent, is pre-programmed to play a dominant strategy, i.e., make optimal decisions. The second, which I refer to as the naïve opponent, is pre-programmed to not play a dominant strategy, i.e., make sub-optimal decisions. In the experiment, some subjects exclusively play the game, while other subjects initially observe a subject playing the game and then play the game themselves. This variation in whether subjects initially play the game or observe play of the game, in combination with the variation in the decision making quality of the computer opponent, allows me to identify how strategic LBD and strategic LBO are impacted by the opponent's decision making quality.

I find that subjects who initially played against the optimizing opponent make significantly better decisions than subjects who initially played against the naïve opponent. However, I find very little difference in decision making quality between subjects who initially observed another subject playing the optimizing opponent and subjects who initially observed another subject playing the naïve opponent. These results suggest that the effectiveness of strategic LBO may not be affected by the decision making quality of the opponent, while the effectiveness of strategic LBD can possibly be increased by playing against an opponent who makes optimal decisions.

As a second motivation of this study, I experimentally compare the effectiveness of strategic LBD and strategic LBO. That is, I compare the decision making quality of subjects who initially play the game with that of subjects who initially observe another subject play the game; the design allows me to investigate this comparison for both the optimizing opponent and the naïve opponent. This is similar in spirit to Merlo and Schotter (2003), who experimentally compare LBD and LBO in a single-agent profit maximization problem.3 In their setting, Merlo and Schotter find that subjects who initially observe learn better than subjects who initially do. In the strategic game that I consider, I find very little difference between the decision making quality of the subjects who initially observe and the subjects who initially play. The results suggest that strategic LBD and strategic LBO appear to be comparably effective mechanisms for making better decisions. Because this study compares LBD and LBO in a strategic setting, the results should be viewed as complementary to those of Merlo and Schotter.

2 The use of pre-programmed computer opponents is certainly not novel to this study. For example, Johnson et al. (2002) use pre-programmed computer opponents in an alternating bargaining game. Merlo and Schotter (2003) use pre-programmed computer opponents in a tournament game. Shachat and Swarthout (2004) use pre-programmed computer opponents in a matching pennies game. Dursch et al. (2010) use pre-programmed computer opponents in a repeated Cournot game.

3 Technically, the authors consider a 2-player simultaneous move tournament game. However, the authors effectively transform the 2-player game into a single agent profit maximization problem by informing subjects that they will face a computer opponent that is pre-programmed to always make the same pre-specified decision.

It is important to note that in a game with multiple decision makers, agents who repeatedly play the game will also observe the decisions made by the other agents playing the game, i.e., observation of the opponent is naturally embedded in playing a multi-decision maker game. In this regard, strategic LBD includes the effect of simultaneously observing the opponent while playing. Therefore, when I compare strategic LBD with strategic LBO, I am actually identifying the compound effect of playing the game and observing the opponent, compared to observing an agent play the game and observing the opponent.4 Because subjects are playing a game, it is impossible to isolate the effect of playing from that of observing the opponent. At the same time, because subjects who play necessarily observe the opponent, I contend that the compound effect is the appropriate and meaningful effect to investigate when comparing LBD and LBO in a strategic setting.

The goal of this study is to gain insights regarding how agents learn to make better strategic decisions. Ideally, one would want to gain such insights by experimenting in strategic settings in the field. However, identifying and measuring decision making quality in the field can be difficult. Optimal decisions often depend on the agent's preferences and/or beliefs, both of which are difficult to measure. Without this information, it may be difficult to measure the quality of an agent's decisions in the field. A similar line of reasoning applies to the difficulty of identifying the decision making quality of an agent's opponent in field settings. By considering an experimental game that features a dominant strategy, I am able to identify and measure optimal strategic decision making. Additionally, the lab provides a setting where it is possible to systematically control the decision quality of an agent's opponent by using pre-programmed computer opponents. Therefore, a stylized lab experiment enables me to gain insights regarding the impact of the opponent's decision making quality on strategic LBD and strategic LBO, which would otherwise be difficult to obtain by experimenting in the field.

The chosen experimental game features a dominant strategy, and this makes it atypical of many strategic settings in the field. Games that feature a dominant strategy are similar to intellective tasks, in the sense that there exists a demonstrably correct solution (Laughlin (1980), Laughlin and Adamopoulos (1980), and Laughlin and Ellis (1986)). This similarity is relevant in light of several studies that have shown differences in learning between intellective tasks and judgement tasks (see Hill (1982), Hastie (1986), Levine and Moreland (1998), and Laughlin et al. (2002) for a thorough review). I acknowledge that using a game featuring a dominant strategy may limit the generalizability (Levitt and List (2007)) of the insights gleaned from the experimental results regarding learning in general strategic settings. Nevertheless, even at its most limited scope, this study can help better our understanding of LBD and LBO in strategic settings that feature a dominant strategy. Such insights can be valuable when applied to games in the lab featuring dominant strategies, e.g., 2-player guessing games (Grosskopf and Nagel (2008) and Chou et al. (2009)), and games in the field featuring dominant strategies, e.g., second-price auctions.5

4 I thank an anonymous referee for calling to my attention this valid and important point.

Although the experimental design features a stylized strategic setting, the results from this study can be useful for informing a broader range of topics including, but not limited to, the design of effective teaching modules, the design of effective job training programs, or, perhaps on a more casual level, a lucrative poker career.

The paper proceeds by formally describing the experimental design and developing the research hypotheses in Section 2. The results are presented in Section 3, and I conclude with discussion in Section 4.

2 Experimental Design and Research Hypotheses

2.1 The Game of 21

I consider a 2-player, sequential move, constant-sum game of perfect information. The game, introduced for experimental purposes by Dufwenberg et al. (2010) (DSB henceforth), is the game of 21 (G21 henceforth). The game begins with the first mover choosing either 1 or 2. After observing the first mover's choice, the second mover chooses to increase the count by either 1 or 2. For example, if the first mover chooses 2, the second mover can choose either 3 or 4. The players continue alternating turns, increasing the count by either 1 or 2; the player who chooses 21 wins.6 What is the optimal way to play G21?

Upon some reflection, one realizes that G21 features a second mover advantage, where the second mover can guarantee victory by choosing every multiple of three, i.e., choosing 3, 6, 9, 12, 15, 18, and then 21 to win the game. Thus, any strategy that prescribes choosing every multiple of three is dominant for the second mover.7

5 Grosskopf and Nagel (2008) speak to the importance of investigating learning in games with dominant strategies by noting in their conclusion that "it remains to be investigated whether and how people can learn to choose zero in the n=2 BCG" (p. 98).

6 Gneezy et al. (2010) concurrently introduced two related games for experimental purposes, which the authors refer to generally as "race games". In their G(15,3) (G(17,4)) race game, subjects alternate incrementing the count by 1, 2, or 3 (1, 2, 3, or 4), and the first person to reach 15 (17) wins. Levitt et al. (forthcoming) consider two versions of the "race to 100" game, where subjects alternate incrementing the count by either 1-9 or 1-10 and the first person to reach 100 wins. Dixit (2005) refers to an empirical account of a related game, the 21 flags game, which appeared on an episode of the TV series "Survivor" in 2002 as an immunity challenge between two teams. The two teams alternated removing 1, 2, or 3 flags from an initial pile of 21 flags. These related "race" games feature a similar structure and strategic properties to those of G21.

In addition, any subgame that contains an available multiple of three also features a dominant strategy of choosing that multiple of three followed by all subsequent multiples of three. That is, if one of the players fails to choose a multiple of three, the other player can guarantee victory by choosing that multiple of three and all subsequent multiples of three. I generally refer to the class of strategies that prescribe choosing every available multiple of three, which guarantees victory, as the dominant solution to G21.
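The dominant solution can also be verified mechanically by backward induction over the possible counts. The following sketch is illustrative only (the experimental software itself was written in Microsoft Visual Basic; this code is not part of it); it labels each count handed to a mover as winning or losing:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def mover_wins(count):
    """True if the player about to move from `count` can force a win in G21."""
    for nxt in (count + 1, count + 2):
        if nxt == 21:
            return True                      # choosing 21 wins immediately
        if nxt < 21 and not mover_wins(nxt):
            return True                      # hand the opponent a losing count
    return False

# The mover loses exactly when handed a multiple of three:
losing_counts = [c for c in range(21) if not mover_wins(c)]
print(losing_counts)  # [0, 3, 6, 9, 12, 15, 18]
```

Since the first mover starts from a count of 0, itself a multiple of three, this confirms the second mover advantage: the second mover can always return the count to a multiple of three and eventually choose 21.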

G21 features two properties that make it well-suited for the purposes of this study. First, G21 features a dominant solution that acts as an identifiable and measurable proxy for optimal decision making in G21. Simultaneously, the dominant solution eliminates the ambiguities in identifying optimal decision making that can arise from differences in beliefs about the opponent's strategy. Second, G21 is constant-sum and features a binary outcome of "win" or "lose". This eliminates possible ambiguities in identifying optimal decisions that can arise from efficiency concerns, distributional preferences, and/or belief based motivations. Differences in beliefs about others and the presence of social preferences can lead to obvious confounds when trying to identify optimal decision making in other frequently implemented experimental games, e.g., three or more player guessing (p-beauty contest) games, trust games, centipede games, and sequential bargaining games.8

In order to investigate strategic LBD and strategic LBO, it is crucial that the chosen game be complex enough to allow for learning. That is, G21 must be complex enough that most subjects playing G21 for the first time fail to play the dominant solution. DSB (2010) find that roughly 85 percent of subjects playing G21 for the first time initially fail to play the dominant solution. Similarly, Gneezy et al. (2010) also find that most subjects initially fail to play the dominant solution in the related race games they consider. However, in both studies, many subjects learn to play the dominant solution after several rounds of play. The previous experimental results suggest that G21 is sufficiently complicated that most subjects initially fail to play a dominant solution, yet it is sufficiently straightforward to admit the possibility of improved decision making. Additionally, the sequential structure of G21, compared to a simultaneous move game, allows for the construction of more robust measures of decision making quality. These measures enable me to better quantify the degree to which subjects deviate from optimal decision making.

7 Choosing every multiple of three does not describe a complete strategy, as it does not specify an action for the agent at information sets where a multiple of three is not available. A complete strategy must specify an action for the second mover at these information sets, although these information sets will not be reached if the second mover plays a dominant strategy of choosing every multiple of three. Any strategy that specifies choosing every available multiple of three, regardless of what is chosen when a multiple of three is not available, is dominant because it will guarantee victory for the second mover. Thus, the second mover has many dominant strategies.

8 See Nagel (1995), Duffy and Nagel (1997), Bosch-Domenech et al. (2002), and Ho et al. (1998) for experimental studies using the guessing game. See Rosenthal (1981), McKelvey and Palfrey (1992), Fey et al. (1996), Rapoport et al. (2003), and Palacios-Huerta and Volij (2009) for experimental studies using the centipede game. See Binmore et al. (1985), Ochs and Roth (1989), Harrison and McCabe (1992, 1996), Johnson et al. (2002), Binmore et al. (2002), and Carpenter (2003) for experimental studies of sequential bargaining.

G21 is a stylized strategic game, and this feature creates a clear cost of the experimental design. Namely, the fact that G21 features a dominant solution may limit the extent to which the experimental insights apply to a general class of strategic settings. However, the stylized features of G21 create a clear benefit of the experimental design. Namely, the fact that G21 features a dominant solution, is constant-sum, and features a binary outcome enables me to identify, and measure, strategic decision making quality. Thus, G21 provides a suitable platform for gaining initial insights regarding how the decision making quality of an opponent affects strategic LBD and strategic LBO.9

2.2 Experimental Treatments and Research Hypotheses

There were two possible types of computer opponents in the experiment, which I refer to as the optimizing opponent and the naïve opponent. The optimizing opponent was pre-programmed to play the dominant strategy of choosing every available multiple of three, and to randomly increment the count by 1 or 2 when a multiple of three was not available. The naïve opponent was pre-programmed to randomly increment the count by 1 or 2 at every decision.10
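As a sketch, the two pre-programmed behaviors can be written as policies mapping the current count to a choice. The function names are mine, and this is a reconstruction of the described behavior rather than the actual experimental software (which was programmed in Microsoft Visual Basic):

```python
import random

def optimizing_opponent(count):
    """Play the dominant strategy: take the next multiple of three when one
    is reachable; otherwise increment the count by 1 or 2 at random."""
    for inc in (1, 2):
        if (count + inc) % 3 == 0:
            return count + inc
    return count + random.choice((1, 2))

def naive_opponent(count):
    """Increment the count by 1 or 2 uniformly at random at every decision
    (assumed capped so the count never passes 21)."""
    return count + random.choice((1, 2)) if count <= 19 else 21

def play_g21(first_policy, second_policy):
    """Play one game of G21; return 1 if the first mover chooses 21, else 2."""
    count, mover = 0, 1
    while count < 21:
        count = (first_policy if mover == 1 else second_policy)(count)
        winner, mover = mover, 3 - mover
    return winner

# Moving second, the optimizing policy wins every game against the naive one:
wins = sum(play_g21(naive_opponent, optimizing_opponent) == 2 for _ in range(1000))
print(wins)  # 1000
```

The simulation illustrates the second mover advantage described above: the optimizing policy reaches 3 on its first move and then holds every subsequent multiple of three.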

I implement a between-groups design with four treatments. Each of the four treatments is described as follows:

Optimal Player The subject played six rounds of G21 against the optimizing opponent followed by six rounds of G21 against the naïve opponent.

Naïve Player The subject played all twelve rounds of G21 against the naïve opponent.

9 The use of G21, in lieu of other variations of the general class of race games, was strictly a design choice. All variations of the race game feature similar properties: a dominant solution, constant sum, and a binary outcome. Therefore, any other variation of the race game would have been equally suitable.

10 Hall (2009) implements a similar design where subjects play variations of the race game against a pre-programmed computer opponent. However, the motivation of Hall's paper is to experimentally investigate how mid-game teaser payments, both on and off the backward induction path, affect a subject's ability to work out the backward induction solution. Hall only considers a pre-programmed opponent who always plays the dominant strategy when possible and randomizes otherwise, which is analogous to what I refer to as the optimizing opponent.


Optimal Observer The subject first observed a subject from the Optimal Player treatment play the initial six rounds of G21, and then proceeded to play six rounds of G21 against the naïve opponent.

Naïve Observer The subject first observed a subject from the Naïve Player treatment play the initial six rounds of G21, and then proceeded to play six rounds of G21 against the naïve opponent.

Before I proceed, let me first highlight the motivation for having all four treatments play the final six rounds of G21 against the naïve opponent. First, the treatments differ only in terms of the subjects' experience during the first six rounds of play. As a result, any differences in decision making between the treatments in the last six rounds can be attributed to the subjects' experience in the first six rounds, thus isolating the treatment effect. Second, playing the final six rounds against the naïve opponent allows for the construction of a variety of decision making quality measures (defined and described in the next section) that can be used to compare decision making across treatments.11 The multiple measures of decision making quality allow me to more precisely quantify the extent to which subjects deviate from optimal decision making, which provides a more robust conclusion about how the decision making quality of the opponent impacts strategic LBD and strategic LBO.

Recall that the first motivation of this study is to investigate whether (1) strategic LBD and (2) strategic LBO are more effective when facing an optimal decision making opponent compared to a sub-optimal decision making opponent. To answer these questions, I use the experimental data to test the following two corresponding hypotheses: (H1) subjects from the Optimal Player treatment exhibit better decision making than subjects from the Naïve Player treatment, and (H2) subjects from the Optimal Observer treatment exhibit better decision making than subjects from the Naïve Observer treatment.

The second motivation of this study is to compare the effectiveness of strategic LBD and strategic LBO, for both the (3) optimizing opponent and the (4) naïve opponent. I use the experimental data to compare strategic LBD and strategic LBO with the following two corresponding hypotheses: (H3) subjects from the Optimal Player treatment exhibit decision making quality equivalent to that of subjects from the Optimal Observer treatment, and (H4) subjects from the Naïve Player treatment exhibit decision making quality equivalent to that of subjects from the Naïve Observer treatment.

11 This is because, in rounds where subjects play the optimizing opponent, very little insight can be gained regarding a subject's decision making quality. In particular, when subjects act as the first mover against the optimizing opponent, they will not have an opportunity to play the dominant solution. Similarly, when subjects act as the second mover and fail to choose a multiple of three, they will be unable to begin playing the dominant strategy in any later subgame. However, when subjects act as the first or second mover and play against the naïve opponent, opportunities will likely arise, at various subgames, to begin playing a dominant strategy. Because the naïve opponent randomizes equally at every decision node, if a subject fails to choose an available multiple of three, then there is a 50% chance that a multiple of three will be available on the subject's next turn.
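The 50% figure cited in footnote 11 follows directly from the naïve opponent's uniform randomization, and can be checked with a short simulation (a sketch with hypothetical function names; it is not part of the original analysis):

```python
import random

def multiple_reachable(count):
    """A multiple of three is reachable from `count` by adding 1 or 2
    exactly when `count` is not itself a multiple of three."""
    return any((count + inc) % 3 == 0 for inc in (1, 2))

random.seed(0)
trials, available = 10_000, 0
for _ in range(trials):
    # A count from which the subject could have taken a multiple of three:
    count = random.choice([c for c in range(1, 18) if multiple_reachable(c)])
    # The subject errs, taking the one increment that misses the multiple:
    subject = next(count + inc for inc in (1, 2) if (count + inc) % 3 != 0)
    # The naive opponent randomizes; can the subject reach a multiple now?
    opponent = subject + random.choice((1, 2))
    available += multiple_reachable(opponent)

print(available / trials)  # close to 0.5
```

Intuitively, after the subject's error the count is not a multiple of three, so exactly one of the naïve opponent's two equally likely increments lands on a multiple of three and removes the subject's next opportunity.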


2.3 Experimental Procedure

The subject pool for this experiment consisted of undergraduates from the University of Arizona. All of the sessions were conducted in the Economic Science Laboratory (ESL) at the University of Arizona. A total of four sessions were conducted using 96 subjects. This was a computerized experiment, and the experimental software was programmed in Microsoft Visual Basic. A copy of the experimental instructions and sample screen shots are presented in the Appendix. All subjects were informed that they were playing against a computer opponent, but received no information about the strategy implemented by the computer.12

Twenty-four subjects participated in each of the four treatments. In each experimental session, each player was instructed to sit at their assigned computer carrel, and each observer was instructed to stand quietly behind their assigned player. An experimenter was present to monitor all sessions and to ensure that there was no communication between players and observers. After reading the instructions, players proceeded to play six rounds of G21 against their designated computer opponent while the observers merely observed the play.13 After observing the first six rounds of play, the observers were then instructed to move to their own carrels, where they proceeded to play six rounds of G21. The players then proceeded to play an additional six rounds of G21 with no observer. Each subject began as the first mover and alternated between first mover and second mover in all subsequent rounds.

All subjects were paid a $5 USD show-up fee. Additionally, the players received $1 USD for every round they won across all twelve rounds. The observers were paid $1 USD for every round that their corresponding player won during the first six rounds, plus $1 USD for every round they won while playing during the last six rounds. Paying the observers for the first six rounds while they observed mitigated any spiteful feelings that may arise from possible inequity in payments between player and observer. In addition, this controlled for the amount of money won in the first six rounds between the player and observer. This ensured that any differences in behavior between the two treatments were not systematically caused by differences in monetary earnings during the first six rounds. Each session lasted about thirty minutes, and no subject participated in more than one session.

12 In particular, subjects were not informed about whether they were facing the optimizing opponent or the naïve opponent. This was intended to help isolate the effect of the opponent's decision making quality on strategic LBD and strategic LBO by eliminating any possible compound information effects that could arise from informing subjects about the decision quality of the opponent. It is certainly possible that providing information about the decision making quality of the opponent could affect the results, as noted by an anonymous referee. However, the absence of such information about the opponent's decision making quality does not undermine or nullify the research hypotheses regarding how strategic LBD and strategic LBO are impacted by the actual decision quality of the opponent. Investigating how information regarding the opponent's decision quality affects strategic LBD and strategic LBO could be an interesting area of future research, although it is beyond the scope of the current study.

13 It is possible that having an observer can affect the decision making of the player. However, an observer was present in both the optimal and naïve treatments, which controls for this possible effect, if it exists. Additionally, there is little reason to think that any observer effect would be systematically different between the treatments; therefore, any observer effect is not likely to bias the results when investigating the relative difference between the Optimal Player and Naïve Player treatments.

3 Results

Testing H1-H4 requires measuring the decision making quality of subjects playing G21. To help minimize measurement error and provide a more well-rounded and robust test of these research hypotheses, I consider several complementary measures that characterize decision making quality in G21. When testing the research hypotheses, I only compare decision making for the last 6 rounds of play (rounds 7-12), when all subjects played G21 against the naïve opponent. Therefore, any differences in decision making quality between the treatments in the last 6 rounds can be attributed to the treatment effect, namely, whether the subject initially played or observed, and whether the subject faced the optimizing opponent or the naïve opponent. Additionally, all hypotheses are tested using aggregate decision making data (over the last 6 rounds), which helps smooth any noise resulting from subject heterogeneity in inherent decision making aptitude.

The first measure of decision making quality I consider is an Error Free Game (EFG). I define an EFG as a game of G21 in which the subject chooses every available multiple of three.14 Choosing every available multiple of three in G21 guarantees victory. Given that, I assume that playing an EFG serves as a proxy measure for optimal decision making in G21, and playing more EFGs can be viewed as evidence consistent with better decision making in G21.15

However, a subject may fail to play an EFG but nevertheless exhibit comparatively better decision making in G21. For example, it seems reasonable to argue that a subject who failed to choose 3 (when it was available), but proceeded to choose every subsequent available multiple of three in the game, exhibited better decision making than a subject who failed to choose all available multiples of three in a game. In order to compare decision making quality in the absence of optimal play, I consider two additional dynamic measures of decision making quality in G21.

14 When subjects act as the second mover, an EFG corresponds to choosing every multiple of three, which is analogous to the measure of "perfect play" considered by DSB (2010).

15 Because playing an EFG requires choosing several sequential multiples of three, it is unlikely that a subject would play an EFG by chance. In order to play an EFG as the second mover, a subject must choose 7 sequential multiples of three. The probability of playing an EFG by randomly incrementing the count by one or two is (1/2)^7 ≈ .008. In addition, there is no reason to suspect that the likelihood of randomly playing an EFG differs between the two treatments.


The second measure I consider is Error Rate (ER). I define ER as the percentage of available multiples of three that a subject failed to choose. I assume that a failure to choose a multiple of three is a sub-optimal decision, and thus, ER serves as a "bird's-eye view" measure of decision quality in G21. Hence, a lower ER can be viewed as evidence of better decision making in G21. The reported ERs are calculated as the subject-level average ER over rounds 7-12. The third measure I consider is Last Error (LE). I define LE as the last multiple of three, in a given game of G21, that a subject explicitly failed to choose. The LE measure is an element of the set {0, 3, 6, 9, 12, 15, 18}.16 The LE measure is intended to proxy for the number of steps of reasoning exhibited by a particular subject in a game of G21, with lower LE levels indicating higher reasoning ability.17 Thus, a lower LE level can be viewed as evidence of better decision making in G21. The reported LE levels are calculated as the subject-level median LE over rounds 8, 10, and 12.
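As an illustration of how these measures are scored, ER and LE for a single game can be computed from the set of multiples of three available to a subject and the subset actually chosen (a hypothetical sketch of the scoring rules defined above; the data representation is mine, not the experimental software's):

```python
def error_rate(available, chosen):
    """ER: percentage of the available multiples of three that the
    subject failed to choose."""
    missed = [m for m in available if m not in chosen]
    return 100 * len(missed) / len(available)


def last_error(available, chosen):
    """LE: the last available multiple of three the subject failed to
    choose; 0 if the subject missed none (an Error Free Game)."""
    missed = [m for m in available if m not in chosen]
    return max(missed) if missed else 0
```

For example, a second mover who chose every available multiple of three except 9 scores ER = 100/7 ≈ 14% and LE = 9, while an Error Free Game scores ER = 0 and LE = 0.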

Testing H1-H4 is only possible under the maintained assumptions that G21 is (A1) sufficiently complex that most subjects initially fail to make optimal decisions, but (A2) sufficiently straightforward that learning is possible. These assumptions are what make it possible to test how strategic LBD and strategic LBO are affected by the decision making quality of a subject's opponent. To verify A1, I look at the data from round 2. In round 2, only 5/24 (21%) subjects from the Optimal Player treatment and 2/24 (8%) subjects from the Naïve Player treatment played an EFG, which confirms A1.18

To verify A2, I compare the decision making of subjects in the Naïve Player treatment in rounds 1-6 with rounds 7-12. Table 1 reveals that subjects in the Naïve Player treatment played significantly more EFGs, had significantly lower ERs, and had significantly lower LE levels in rounds 7-12 compared to rounds 1-6. That is, the Naïve Player treatment subjects exhibited better decision making in rounds 7-12 than in rounds 1-6, which provides evidence that learning is possible and verifies A2. Given that A1 and A2 are satisfied, it is possible to proceed with testing H1-H4.

16 It is possible for a subject to fail to choose 21 when it is available. However, this only happened 8 times in 576 games. It is likely that these few instances were accidental and resulted from the specific software design of the user interface (see the Appendix for a sample screen shot). As a result, I drop all 8 cases where LE = 21 from the analysis. This has no significant impact on the hypothesis testing.

17 I stress that LE is a proxy for the number of steps of reasoning. Consider a subject who fails to choose 9 and then just randomly chooses 12, 15, 18, and 21. The subject may not have exhibited any steps of reasoning, but nevertheless, LE = 12 is recorded for this subject. Alternatively, suppose the first available multiple of three comes at 15. If the subject chooses 15, 18, and 21, then LE = 0 for that game, which could bias the measured LE downward. To help mitigate this possible bias, I only consider the LE for rounds 8, 10, and 12, when subjects acted as the second mover. Additionally, I test the differences in LE between treatments to smooth any residual measurement biases.

18 Round 2 is the first round when subjects from the Optimal Player treatment acted as the second mover and, therefore, had an opportunity to play an EFG.


Table 1
Comparison of Naïve Player data from rounds 1-6 and rounds 7-12

  Measure      Rounds 1-6      Rounds 7-12     (p-value)
  EFGs         16/144 (11%)    27/144 (19%)    (0.049)
  Mean ER      44%             36%             (0.008)
  Mean LE      12.00           9.00            (0.023)

Notes: The reported data are aggregated over rounds 1-6 and rounds 7-12, respectively. EFGs were tested using a 1-sided Fisher's exact test; Mean ER and Mean LE were tested using a Wilcoxon matched-pairs signed-ranks test.

I first test H1, namely, that subjects from the Optimal Player treatment exhibit better decision making than subjects from the Naïve Player treatment. Table 2 reports the aggregate comparison of EFGs, ERs, and LE levels between the two treatments. From Table 2 we see that the aggregate proportion of EFGs for the Optimal Player treatment is 43/144 (30%) compared to 27/144 (19%) for the Naïve Player treatment, which is significantly different (p = 0.028). The mean ER is 29% for subjects in the Optimal Player treatment compared to 36% for the Naïve Player treatment, which is significantly different (p = 0.052). The mean LE and median LE are 6.38 and 4.50 in the Optimal Player treatment compared to 9.00 and 9.00 in the Naïve Player treatment, which are significantly different (p = 0.056) and (p = 0.042), respectively. Taken together, the data is consistent with H1, which suggests that strategic LBD can be more effective when agents play an optimal decision making opponent, relative to a sub-optimal decision making opponent.
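The one-sided Fisher's exact comparison of EFG proportions can be reproduced directly from the hypergeometric distribution using only the aggregate counts reported above (a sketch of the standard textbook construction, not the author's own code):

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided (greater) Fisher's exact p-value for the 2x2 table
    [[a, b], [c, d]]: the probability, holding the margins fixed, of a
    count in cell a at least as large as the one observed."""
    n = a + b + c + d
    row1 = a + b    # size of the first group
    col1 = a + c    # total number of "successes" across both groups
    denom = comb(n, row1)
    return sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, min(row1, col1) + 1)) / denom

# EFGs in rounds 7-12: 43/144 (Optimal Player) vs. 27/144 (Naive Player)
p = fisher_exact_greater(43, 101, 27, 117)
```

The resulting p-value falls below the 5% level, in line with the (p = 0.028) reported for the EFG comparison.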

I proceed by testing H2, namely, that subjects from the Optimal Observer treatment exhibit better decision making than subjects from the Naïve Observer treatment. Table 3 reports the aggregate comparison of EFGs, ERs, and LE levels between the two treatments. The aggregate proportion of EFGs is 43/144 (30%) in the Optimal Observer treatment compared to 20/144 (14%) in the Naïve Observer treatment, which is significantly different (p = 0.001). However, from Table 3 we can see that there is no significant difference in mean ER, mean LE, or median LE between the Optimal Observer treatment and the Naïve Observer treatment. Taken together, the data is largely inconsistent with H2, which suggests that strategic LBO is, at most, marginally more effective when observing an optimal decision making opponent.


Table 2
Comparison of Optimal Player and Naïve Player data (rounds 7-12)

  Measure      Optimal Player   Naive Player    (p-value)
  EFG          43/144 (30%)     27/144 (19%)    (0.028)
  Mean ER      29%              36%             (0.052)
  Mean LE      6.38             9.00            (0.056)
  Median LE    4.50             9.00            (0.042)

Notes: EFG was tested using a 1-sided Fisher's exact test. Mean ER and Mean LE were tested using a 1-sided Mann-Whitney U-test. Median LE was tested using a k-sample medians test.

Table 3
Comparison of Optimal Observer and Naïve Observer data (rounds 7-12)

  Measure      Optimal Observer   Naive Observer   (p-value)
  EFG          43/144 (30%)       20/144 (14%)     (0.001)
  Mean ER      35%                38%              (0.368)
  Mean LE      10.50              10.38            (0.651)
  Median LE    12.00              12.00            (0.654)

Notes: EFG was tested using a 1-sided Fisher's exact test. Mean ER and Mean LE were tested using a 1-sided Mann-Whitney U-test. Median LE was tested using a k-sample medians test.


To test H3 and H4, I compare the decision making between players and observers for both the optimizing opponent and the naïve opponent. Table 4 reports the relevant comparisons of the aggregate EFGs, ERs, and LE levels. Looking first at when subjects faced the optimizing opponent, Table 4 reveals that the aggregate proportion of EFGs is 43/144 (30%) for both the player subjects and the observer subjects, which is clearly not significantly different. Similarly, the mean ER is 29% for the player subjects compared to 35% for the observer subjects, which is not significantly different (p = 0.269). However, the mean LE of 6.38 and median LE of 4.50 for the player subjects are significantly different from the mean LE of 10.50 and median LE of 12.00 for the observer subjects, (p = 0.034) and (p = 0.043) respectively. Looking next at when subjects faced the naïve opponent, Table 4 reveals that there are no significant differences in EFGs, ERs, or LE levels between the player subjects and the observer subjects.

The data reveals very little difference in decision making between player subjects and observer subjects when facing the naïve opponent, which is consistent with H4. The data also reveals some differences in decision making between players and observers when facing the optimizing opponent. Specifically, players have significantly lower LE levels compared to observers when facing the optimizing opponent. Taken together, the results suggest that strategic LBD and strategic LBO are comparable in terms of their effectiveness at improving strategic decision making in G21 when facing a sub-optimal decision making opponent; there is some evidence that strategic LBD may be marginally more effective than strategic LBO when facing an optimal decision making opponent.

Table 4
Comparison of Players and Observers (rounds 7-12)

                   Optimizing Opponent             Naïve Opponent
  Measure      Player    Observer  (p-value)   Player    Observer  (p-value)
  EFG          43/144    43/144    (1.000)     27/144    20/144    (0.339)
               (30%)     (30%)                 (19%)     (14%)
  Mean ER      29%       35%       (0.269)     36%       38%       (0.975)
  Mean LE      6.38**    10.50     (0.034)     9.00      10.38     (0.399)
  Median LE    4.50**    12.00     (0.043)     9.00      12.00     (0.768)

Notes: EFG was tested using a 2-sided Fisher's exact test. Mean ER and Mean LE were tested using a 2-sided Mann-Whitney U-test. Median LE was tested using a k-sample medians test.


4 Conclusion

The motivation of this paper was twofold. The first was to experimentally test whether strategic LBD and strategic LBO are more effective when agents face an opponent who exhibits optimal decision making, relative to sub-optimal decision making. The experimental data reveals that subjects who initially played G21 against the optimizing opponent exhibited significantly better decision making than those subjects who initially played G21 against the naïve opponent. In contrast, subjects who initially observed the optimizing opponent exhibited, at most, only marginally better decision making than those subjects who initially observed the naïve opponent. These results suggest that strategic LBD can be more effective when agents play an optimal decision making opponent relative to a sub-optimal decision making opponent, whereas strategic LBO may be, at most, only marginally more effective when agents observe an optimal decision making opponent relative to a sub-optimal decision making opponent.

The second motivation was to experimentally compare the effectiveness of strategic LBD with strategic LBO, for both an optimal and a sub-optimal decision making opponent. The data reveals no significant difference in decision making quality between subjects who initially played G21 and subjects who initially observed the play of G21 when facing the naïve opponent. When facing the optimizing opponent, the data reveals some significant differences between subjects who initially played and subjects who initially observed, which are in the direction of better decision making for the subjects who played. These results suggest that strategic LBD and strategic LBO can be comparably effective when facing a sub-optimal decision making opponent, while strategic LBD can be marginally more effective than strategic LBO when facing an optimal decision making opponent.

By comparison, Merlo and Schotter (2003) experimentally compare LBD and LBO in a single-agent profit maximization problem and find evidence that LBO outperforms LBD. Because I consider a 2-player strategic game, the players are necessarily able to observe the decision making of the opponent. Thus, there is an element of observational learning, from observing the opponent's decisions, even for agents who play the game. In a broad sense, one can think of the players in a game as getting to play and observe, while observers of a game only get to observe. In light of the results of Merlo and Schotter (2003), this might explain why I find that strategic LBD is at least as effective as strategic LBO. Investigating which settings are more conducive to LBD and LBO, and the reasons why, remain open questions for future research.

Before I conclude, I speculate about two plausible mechanisms that might be responsible for the better decision making exhibited in G21 by subjects who played the optimizing opponent, compared to the naïve opponent. The first is that subjects are implementing some kind of reinforcement learning mechanism (Erev and Roth (1998), and Camerer and Ho (1999)). When subjects play G21 against the optimizing opponent and fail to play the dominant solution, they lose the game. In contrast, subjects who play G21 against the naïve opponent and fail to play the dominant solution can still win the game. Thus, deviations from optimal decision making are more costly when facing the optimizing opponent because they ultimately result in losing the game. As a result, strategies that do not prescribe playing the dominant solution will be negatively reinforced for subjects playing the optimizing opponent, which will lead those subjects to converge to the dominant solution. If sub-optimal strategic decisions by an agent are exploitable by an opponent, then in such strategic settings a reinforcement learning mechanism would increase the effectiveness of strategic LBD when playing an optimal decision making opponent.
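A minimal version of this reinforcement story can be sketched as a Roth-Erev style propensity update at a single decision node (purely illustrative: the decision node, initial propensities, and payoffs are my own stylized assumptions, not estimates from the data). At a count of 19, incrementing by 2 reaches 21 and wins against the optimizing opponent, while any other choice loses, so only the winning increment accumulates weight:

```python
import random

def roth_erev_share(rounds=2000, seed=1):
    """Track the probability of choosing the winning increment (2) at a
    count of 19 under Roth-Erev style reinforcement: choice probabilities
    are proportional to propensities, and only rewarded actions grow."""
    random.seed(seed)
    propensity = {1: 1.0, 2: 1.0}
    for _ in range(rounds):
        total = propensity[1] + propensity[2]
        act = 1 if random.random() < propensity[1] / total else 2
        payoff = 1.0 if act == 2 else 0.0  # only reaching 21 wins the round
        propensity[act] += payoff
    return propensity[2] / (propensity[1] + propensity[2])
```

Because losing choices are never reinforced, the share of weight on the winning increment climbs toward one; against the naïve opponent a subject can still win after a deviation, so the corresponding payoff signal is weaker.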

The second plausible explanation for the better decision making exhibited by subjects who play against the optimizing opponent, compared to the naïve opponent, is that subjects are implementing some kind of imitation learning mechanism (Vega-Redondo (1997) and Schlag (1999)).19,20 The optimizing opponent was pre-programmed to play the dominant solution, and would win every time it acted as the second mover or the subject failed to play the dominant solution. Hence, imitation would lead subjects playing the optimizing opponent to converge on the dominant solution more quickly than subjects playing the naïve opponent. If the strategic setting is symmetric, then an imitation learning mechanism would increase the effectiveness of strategic LBD when playing an optimal decision making opponent. Identifying why strategic LBD is more effective when the opponent makes better decisions, and under what strategic conditions, could be an area for future research.

Although the strategic setting considered in this study was abstract and atypical of many strategic settings in the field, a few important insights can be gleaned. First, strategic LBD can be more effective when facing an opponent who makes good decisions. Thus, agents who are motivated to become better strategic decision makers as quickly and effectively as possible, through repeated play, should consider playing against good decision making opponents. Second, strategic LBO is, at most, only marginally less effective than strategic LBD in strategic settings. This may be particularly relevant in complex settings where strategic LBD is potentially costly, as it may behoove an agent to gather knowledge by initially observing, in lieu of doing. Hence, apprenticing first can be nearly as effective for becoming a better strategic decision maker, and potentially less costly.

19 Vega-Redondo (1997) and Schlag (1999) develop formal models of imitation learning in which agents imitate the actions of others who obtain higher payoffs. Both of these models are developed in a static framework. However, G21 is a sequential game, and the complete strategy of the opponent is not observed upon completion of the round. Therefore, agents playing G21 would be unable to imitate the strategy of the opponent as specifically modeled by Vega-Redondo (1997) and Schlag (1999). In that regard, it may be more apt to refer to this type of mechanism in G21 as "mimicking", akin to DSB (2010).

20 Experimental evidence consistent with imitation has been subsequently documented, e.g., Huck et al. (1999, 2000), Offerman et al. (2002), and Apesteguia et al. (2007).


5 Appendix

5.1 Experimental Instruction for PLAYER

This is an experiment in strategic decision making. Please read these instructions carefully and pay attention to any additional instructions given by the experimenter. You have the potential to earn additional compensation, and the amount depends on the decisions that you make in this experiment.

You have been assigned the PLAYER role. You will be playing 12 rounds of a simple two player game that will be described below. You will be playing against a programmed computer opponent for all 12 rounds. In addition, for the first 6 rounds, an observer will be standing behind you watching you play against the computer. You are not permitted to communicate with the observer in any capacity. After 6 rounds, the observer will be instructed to move to an empty computer carrel and you will be instructed to play 6 more rounds.

After you finish playing all 12 rounds, please remain seated and an experimenter will come by and pay you your additional compensation. In addition to the $5 show-up fee, you will be compensated $1 for each round that you win. The observer will also be compensated $1 for each round that you win during the first 6 rounds. After you have been paid, please quietly leave the laboratory. If you have any questions at any time, raise your hand and an experimenter will be by to answer your question.

5.2 Experimental Instruction for OBSERVER

This is an experiment in strategic decision making. Please read these instructions carefully and pay attention to any additional instructions given by the experimenter. You have the potential to earn additional compensation, and the amount depends on the decisions that you make in this experiment.

You have been assigned the OBSERVER role. You will first be watching 6 rounds, and then be playing 6 rounds, of a simple two player game that will be described below. For the first 6 rounds, you will be standing behind a player watching him/her play against the programmed computer opponent. You are not permitted to communicate with the player in any capacity. After 6 rounds, you will be instructed to move to an empty computer carrel where you will then play 6 rounds of the game against a programmed computer opponent.

After you finish playing the last 6 rounds, please remain seated and an experimenter will come by and pay you your additional compensation. In addition to the $5 show-up fee, you will be compensated $1 for each round that you win while you are playing, and $1 for each round that the player wins while you are observing. After you have been paid, please quietly leave the laboratory. If you have any questions at any time, raise your hand and an experimenter will be by to answer your question.


5.3 Screen Shot of G21 Instructions


5.4 Sample Screen Shot of G21 Interface


References

[1] Apesteguia, J., Huck, S., Oechssler, J., 2007. Imitation - Theory and Experimental Evidence. Journal of Economic Theory 136, 217-235.

[2] Arrow, K., 1962. The Economic Implications of Learning by Doing. The Review of Economic Studies 29, 155-173.

[3] Bikhchandani, S., Hirshleifer, D., Welch, I., 1998. Learning from the Behavior of Others: Conformity, Fads, and Informational Cascades. The Journal of Economic Perspectives 12, 151-170.

[4] Binmore, K., Shaked, A., Sutton, J., 1985. Testing Noncooperative Bargaining Theory: A Preliminary Study. American Economic Review 75, 1178-1180.

[5] Binmore, K., McCarthy, J., Ponti, G., Samuelson, L., Shaked, A., 2002. A Backward Induction Experiment. Journal of Economic Theory 104, 48-88.

[6] Bosch-Domenech, A., Montalvo, J., Nagel, R., Satorra, A., 2002. One, Two, (Three), Infinity, ...: Newspaper and Lab Beauty Contest Experiments. American Economic Review 92, 1687-1701.

[7] Camerer, C., Ho, T-H., 1999. Experience-Weighted Attraction Learning in Normal Form Games. Econometrica 67, 827-874.

[8] Carpenter, J., 2003. Bargaining Outcomes as a Result of Coordinated Expectations: An Experimental Study of Sequential Bargaining. Journal of Conflict Resolution 47, 119-141.

[9] Chou, E., McConnell, M., Nagel, R., Plott, C., 2009. The Control of Game Form Recognition in Experiments: Understanding Dominant Strategy Failures in a Simple Two Person "Guessing" Game. Experimental Economics 12, 159-179.

[10] Dixit, A., 2005. Restoring Fun to Game Theory. Journal of Economic Education 36, 205-219.

[11] Duffy, J., Nagel, R., 1997. On the Robustness of Behavior in Experimental "Beauty Contest" Games. The Economic Journal 107, 1684-1700.

[12] Dufwenberg, M., Sundaram, R., Butler, D., 2010. Epiphany in the Game of 21. Journal of Economic Behavior and Organization.

[13] Duersch, P., Kolb, A., Oechssler, J., Schipper, B., 2010. Rage Against the Machines: How Subjects Learn to Play Against Computers. Economic Theory 43, 407-430.

[14] Erev, I., Roth, A., 1998. Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique, Mixed Strategy Equilibrium. The American Economic Review 88, 848-881.

[15] Fey, M., McKelvey, R., Palfrey, T., 1996. An Experimental Study of Constant-Sum Centipede Games. International Journal of Game Theory 25, 269-289.

[16] Gneezy, U., Rustichini, A., Vostroknutov, A., 2010. Experience and Insight in the Race Game. Journal of Economic Behavior and Organization 75, 144-155.

[17] Grosskopf, B., Nagel, R., 2008. The Two Person Beauty Contest. Games and Economic Behavior 62, 93-99.

[18] Hall, K., 2009. A Pure Test of Backward Induction. Doctoral Dissertation: University of Tennessee.

[19] Harrison, G., McCabe, K., 1992. The Role of Experience for Testing Bargaining Theory in Experiments. In: Isaac, M. (Ed.). Research in Experimental Economics, Vol. 5. Greenwich: JAI Press, 137-169.

[20] Harrison, G., McCabe, K., 1996. Expectations and Fairness in a Simple Bargaining Experiment. International Journal of Game Theory 25, 303-327.

[21] Hastie, R., 1986. Review Essay: Experimental Evidence on Group Accuracy. In: Owen, G., Grofman, B. (Eds.). Information Pooling and Group Decision Making. Westport, CT: JAI Press, 129-157.

[22] Hill, G., 1982. Group versus Individual Performance: Are N + 1 Heads Better than One? Psychological Bulletin 91, 517-539.

[23] Ho, T-H., Camerer, C., Weigelt, K., 1998. Iterated Dominance and Iterated Best Response in Experimental "p-Beauty-Contests". American Economic Review 88, 947-969.

[24] Huck, S., Normann, H.T., Oechssler, J., 1999. Learning in Cournot Oligopoly: An Experiment. Economic Journal 109, C80-C95.

[25] Huck, S., Normann, H.T., Oechssler, J., 2000. Does Information About Competitors' Actions Increase or Decrease Competition in Experimental Oligopoly Markets? International Journal of Industrial Organization 18, 39-57.

[26] Jovanovic, B., Nyarko, Y., 1995. The Transfer of Human Capital. Journal of Economic Dynamics and Control 19, 1033-1064.

[27] John, E.R., Chesler, P., Bartlett, F., Victor, I., 1969. Observational Learning in Cats. Science 166, 901-903.

[28] Johnson, E., Camerer, C., Sen, S., Rymon, T., 2002. Detecting Failures of Backward Induction: Monitoring Information Search in Sequential Bargaining. Journal of Economic Theory 104, 16-47.

[29] Laughlin, P., 1980. Social Combination Processes of Cooperative Problem-Solving Groups on Verbal Intellective Tasks. In: Fishbein, M. (Ed.). Progress in Social Psychology. Hillsdale, NJ: Erlbaum, 127-155.

[30] Laughlin, P., Adamopoulos, J., 1980. Social Combination Processes and Individual Learning for Six-Person Cooperative Groups on an Intellective Task. Journal of Personality and Social Psychology 38, 941-947.

[31] Laughlin, P., Bonner, B., Miner, A., 2002. Groups Perform Better than the Best Individuals on Letters-to-Numbers Problems. Organizational Behavior and Human Decision Processes 88, 605-620.

[32] Laughlin, P., Ellis, A., 1986. Demonstrability and Social Combination Processes on Mathematical Intellective Tasks. Journal of Experimental Social Psychology 22, 177-189.

[33] Levine, J., Moreland, R., 1998. Small Groups. In: Gilbert, D., Fiske, S., Lindzey, G. (Eds.). The Handbook of Social Psychology, 4th Edition, Vol. 2. New York: McGraw-Hill, 415-469.

[34] Levitt, S., List, J., 2007. What Do Laboratory Experiments Measuring Social Preferences Reveal About the Real World? Journal of Economic Perspectives 21, 153-174.

[35] Levitt, S., List, J., Sadoff, S. Checkmate: Exploring Backward Induction Among Chess Players. Forthcoming in American Economic Review.

[36] McKelvey, R., Palfrey, T., 1992. An Experimental Study of the Centipede Game. Econometrica 60, 803-836.

[37] Merlo, A., Schotter, A., 2003. Learning by not Doing: An Experimental Investigation of Observational Learning. Games and Economic Behavior 42, 116-136.

[38] Nadler, J., Thompson, L., Van Boven, L., 2003. Learning Negotiation Skills: Four Models of Knowledge Creation and Transfer. Management Science 49, 529-540.

[39] Nagel, R., 1995. Unraveling in Guessing Games: An Experimental Study. American Economic Review 85, 1313-1326.

[40] Ochs, J., Roth, A., 1989. An Experimental Study of Sequential Bargaining. American Economic Review 79, 355-384.

[41] Offerman, T., Potters, J., Sonnemans, J., 2002. Imitation and Belief Learning in an Oligopoly Experiment. Review of Economic Studies 69, 973-997.

[42] Palacios-Huerta, I., Volij, O., 2009. Field Centipedes. The American Economic Review 99, 1619-1635.

[43] Rapoport, A., Stein, W., Parco, J., Nicholas, T., 2003. Equilibrium Play and Adaptive Learning in a Three-Person Centipede Game. Games and Economic Behavior 43, 239-265.

[44] Rosenthal, R., 1981. Games of Perfect Information, Predatory Pricing, and the Chain-Store Paradox. Journal of Economic Theory 25, 92-100.

[45] Schlag, K., 1999. Which One Should I Imitate? Journal of Mathematical Economics 31, 493-522.

[46] Shachat, J., Swarthout, T., 2004. Do We Detect and Exploit Mixed Strategy Play by Opponents? Mathematical Methods of Operations Research 59, 359-373.

[47] Thompson, P., 2010. Learning by Doing. In: Hall, B., Rosenberg, N. (Eds.). Handbook of the Economics of Innovation, Vol. 1. Amsterdam: Elsevier/North-Holland, 429-476.

[48] Tomasello, M., Davis-Dasilva, M., Camak, L., Bard, K., 1987. Observational Learning of Tool-Use by Young Chimpanzees. Human Evolution 2, 175-183.

[49] Terkel, J., 1996. Cultural Transmission of Feeding Behavior in Black Rats (Rattus rattus). In: Heyes, C., Galef Jr., B. (Eds.). Social Learning in Animals: The Roots of Culture. San Diego: Academic Press, 17-45.

[50] Vega-Redondo, F., 1997. The Evolution of Walrasian Behavior. Econometrica 65, 375-384.