
Rational Games: ‘Rational’ stable and unique solutions for multiplayer sequential games. Richard Shiffrin, Michael Lee, Shunan Zhang. Draft—research in progress.

Upload: remedios-knox

Post on 31-Dec-2015



TRANSCRIPT

Rational Games

‘Rational’ stable and unique solutions for multiplayer sequential games.

Richard Shiffrin, Michael Lee, Shunan Zhang

draft—research in progress

• The topic is enormous, overlapping psychology, philosophy, economics, computer science, business, mathematics, logic, political science, and more.

• We have not yet been able to discover whether the present research is a rediscovery—let us know if it is.

• (If it is old, at least we have rediscovered something interesting.)

• The first author has given talks for the last decade arguing that rationality is not a normative concept but rather a social consensus of a sufficiently large proportion of humans judged to be sufficiently expert.

• One of many examples illustrating this idea is found in multi-agent games, in which normative game theory stipulates decisions that harm all players, when players could all gain by other decisions.

• We argue that multiplayer games have rational ‘solutions’, based on players who are selfish and rational but take into account the fact that the other players are also selfish and rational.

• These solutions are not Nash equilibria. Our solutions will find weak and strong Pareto equilibria when they exist (although in most cases these will not exist).

          B: C        B: D
A: C     (5,5)     (-10,20)
A: D    (20,-10)    (-5,-5)

One example is the well known ‘prisoners dilemma’: Players A and B independently decide to cooperate C or defect D. The table gives the payoffs for A and B in that order:

Whatever B chooses, A gains by defecting (similarly for B). Hence (-5,-5) is the Nash Equilibrium. But clearly (5,5) is better for both—the Pareto equilibrium. We argue rational players should cooperate (even many presumably not-so-rational human players do).

Some might argue it is rational for A to defect, even if B chooses to be rational and cooperate. But if it is rational to defect, both players know this and both will do so.
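This line of argument can be checked mechanically. The following sketch (our own encoding and function names, not part of the slides) confirms that mutual defection is the only Nash equilibrium of the payoff table above, while mutual cooperation Pareto-dominates it:

```python
# Hypothetical sketch: check the prisoner's dilemma payoffs from the table.
payoffs = {  # (A's move, B's move) -> (A's payoff, B's payoff)
    ("C", "C"): (5, 5),    ("C", "D"): (-10, 20),
    ("D", "C"): (20, -10), ("D", "D"): (-5, -5),
}
moves = ["C", "D"]

def is_nash(a, b):
    # neither player gains by unilaterally deviating from (a, b)
    return all(payoffs[(x, b)][0] <= payoffs[(a, b)][0] for x in moves) and \
           all(payoffs[(a, y)][1] <= payoffs[(a, b)][1] for y in moves)

def pareto_dominates(p, q):
    return p[0] > q[0] and p[1] > q[1]

nash = [(a, b) for a in moves for b in moves if is_nash(a, b)]
print(nash)  # [('D', 'D')] -- the (-5,-5) outcome
print(pareto_dominates(payoffs[("C", "C")], payoffs[("D", "D")]))  # True
```

The check formalizes the slide's point: best-response reasoning alone lands both players on (-5,-5), even though (5,5) is better for both.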

Doug Hofstadter has called this reasoning ‘super-rationality’. (And there are many other similar ideas; see David Gauthier in Philosophy, and much more).

[Of course real players can’t usually assume the other player is rational, making it difficult or impossible to reach a rational solution.]

• The difficulties faced by real players aside, it is still useful to develop an algorithm for reaching a jointly rational solution.

• Such a solution can serve as a normative baseline against which performance might be measured, and could conceivably serve as a starting point for bargaining.

[Tree: A either stops, yielding (0,0), or passes the move to B, who chooses (9,9) or (-1,10).]

The first number is the payoff for A and the second is the payoff for B.

One might argue B should choose (-1,10) if given a choice, because B acts selfishly and prefers 10 to 9. But if this is indeed rational, A will know this and will selfishly choose (0,0). This is rational for neither player because it is dominated by (9,9). Thus (9,9) is the rational game solution. What prevents B from ‘defecting’ if B is given the choice? If defection is rational, then A knows this and will choose (0,0).

Similar to PD but less obvious is a two-trial two-player centipede game:
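A minimal sketch (ours, not the authors') of this centipede game, using the payoffs from the figure above: purely selfish backward induction yields (0,0), while correlated (‘super-rational’) play yields (9,9).

```python
# Two-trial centipede from the slide: A stops at (0,0) or passes to B,
# who picks (9,9) or (-1,10). Function names are our own.
def backward_induction():
    # B's selfish choice at the second node
    b_choice = max([(9, 9), (-1, 10)], key=lambda p: p[1])   # (-1, 10)
    # A anticipates b_choice and compares it with stopping
    a_choice = max([(0, 0), b_choice], key=lambda p: p[0])   # (0, 0)
    return a_choice

def correlated_play():
    # if the two "cooperate" decisions are assumed correlated,
    # A passes and B cooperates
    return (9, 9)

print(backward_induction())  # (0, 0): the standard backward-induction outcome
print(correlated_play())     # (9, 9): the outcome the slides argue is rational
```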

• Even before presenting an algorithm to find a rational solution, one must do one’s best to establish the rationality of our premises, a highly controversial matter. Covering every aspect of this matter could easily merit a long book.

• Instead we present one argument that is sometimes effective.

• Suppose there are N players, each playing one trial of the centipede game against him or herself. They do this under the drug Midazolam that leaves reasoning intact but prevents memory of what decision had been made at the first decision point. Each player is told to adopt a Player 1 strategy that will maximize the number of times they get a Player 1 payoff better than the other people when they are Player 1. Similarly each player adopts a Player 2 strategy to maximize the number of times they get a Player 2 payoff that is better than the other people when they are Player 2.

• IMPORTANT: When you are Player 2, you must decide what you will do before you know whether you will get the opportunity to play. You must make a conditional decision: if I get to play, what will I do? It could well be that you would decide not to play when Player 1, so you might never get a chance to give a Player 2 decision.

• When you are player 1, your goal is to maximize your player one result (although of course you should consider that when player 2 you will be trying to maximize your player 2 result).

• And similarly when you are player 2.

• (Your goal is not to maximize the sum of the two outcomes, though such an outcome is not precluded if it happens).

• So what strategies do you choose? When Player 1, you can get -1, 0, or 9. If you play when Player 1, and also cooperate when Player 2, then you will get 9 and at least tie for best.

• However, if you think you will decide to defect when Player 2 you would be best off not playing at all (-1 would tie for worst).

• When Player 2 you can get 0, 9, or 10. If you get a chance to play you can cooperate and get 9 or defect and get 10. If you cooperate you will lose to those who play at step 1 and defect at step 2.

• But which other players would do that? Players who think defection is rational will not play at step 1. Thus how can you lose by cooperating?

• You can reason this before playing as Player 1, and hence can confidently play, and be reasonably certain you will later cooperate.

• What is it about playing one’s self that makes cooperation at step 2 seem rational? The key is the correlation between the decisions made at both steps. You are able to assume that whatever you decide before play 1 about what you will later decide as Player 2 will in fact come to pass.

• The assumption of rational players makes this correlation even stronger—if a set of decisions is rational all rational players will adopt them.

• It is of course just this correlation between decisions of multiple players that is ignored when arriving at Nash equilibria and other seemingly irrational decisions.

• Thus when we say all players are assumed rational (not very satisfactory if we have not defined rationality), we are more importantly saying that the decisions of the players are positively correlated.

• There are many reasons why decisions could and should be correlated—social norms, (playing one’s self), group consensus, reliance on experts, and so on.

• Interestingly, if the expert community defines rationality in a way that makes defection rational, then assuming the opponents to be rational would lead to defection, not cooperation. Perhaps this occurred in the early days of game theory.

• Many other arguments can be given for (and possibly against) the rationality of ‘selfish’ cooperation, but for the rest of this presentation, we assume such rationality and see where it leads us.

• We would like to find an algorithm for finding a general rational solution for multiplayer decision games.

• One caveat: let us start by letting all decisions be sequential, not simultaneous.

Why? Simultaneous decisions can lead to ambiguous decision making: there are examples for which there is no unique rational solution. (Examples omitted.) Thus we begin with sequential decisions, for which there is always a solution for the case of two players.

• So consider sequential decision games. These can be written as tree structures, as will be illustrated by examples.

[Tree schematic: A chooses at the root; B chooses at each second-level node; A chooses again at each third-level node; etc.]

• Each player takes turns making choices, sequentially, in specified and known order. The game can be written as a tree, with each terminal node j giving numerical outcomes, (vj1, vj2, …, vjn) for players 1, 2, …, n.

• We will assume that all outcomes at terminal nodes for a given player differ from each other (no ties among any outcomes for a given player).

• Thus every decision path is uniquely identified by the joint payoff outcome at the termination of the decision path, sometimes termed a ‘solution’.

• There are two basic games:

• Quantitative games: All players are assumed to know the utility of all outcomes for all players (perhaps equal), so outcomes can be compared quantitatively (this assumption is of course unlikely). Many unsolved problems exist in this case, including the role of ‘threats’. Thus we consider only:

• Ordinal (Qualitative) games: Each player is assumed to prefer more rather than less, but quantitative differences are otherwise irrelevant. Thus the outcomes for any player only matter ordinally, and quantitative comparisons of outcomes across players are irrelevant. E.g. one player’s outcomes can be exponentiated and nothing would change.

• Some player principles:

• Each player is ‘selfish’, always maximizing individual gain, in the certain knowledge that other players have the same goal, and that all players are fully ‘rational’ in their own self interest.

• The players do not have prior agreements. They know nothing about the other players other than that all players are selfishly rational. (But all know that others’ utilities are unknown, so only ordinality matters.)

• Under these assumptions it is clear that one type of rational solution is the Pareto maximum: All players prefer outcome (vj1, vj2, …, vjn) to (vk1, vk2, …, vkn) if vj1>vk1, vj2>vk2, …, vjn>vkn.

• [Although this Pareto criterion sounds trivial, it is not, as we saw in the examples of the ‘prisoner’s dilemma’ and the ‘centipede game’.]
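The Pareto criterion above translates directly into code. This is our own rendering of the slide's condition, written for n players (the function name is ours):

```python
# The Pareto preference from the slide: outcome vj is preferred to vk
# when every player strictly gains.
def pareto_prefers(vj, vk):
    """True if every coordinate of vj strictly exceeds that of vk."""
    return all(a > b for a, b in zip(vj, vk))

print(pareto_prefers((5, 5), (-5, -5)))    # True: cooperation beats defection
print(pareto_prefers((11, 9), (10, 10)))   # False: the players disagree
```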

• We will see that there are many other rational solutions that are not Pareto optimal (not all players gain).

• Some other examples help set the scene.

• Let there be only two players, A and B.

In the figures and examples to follow, let the first listed outcome at a terminal node be A’s payoff, and the second listed be B’s payoff.

[Tree: A chooses between (10,10) and (15,5).]

Here A will choose (15,5). A has control and is selfish.

[Tree: A chooses the left branch, (10,10), or the right branch, where B chooses between (15,5) and (7,7).]

Here A will choose the left branch, getting 10, because A knows B will maximize her gain by choosing (7,7). A prefers 10 to 7.

A A

(3,15) (7,7)

B B

(10,10) (5,12)

B B

A controls the first choice and B controls the second choice. (5,12) is the rational joint choice: A knows B willchoose selfishly and therefore chooses the left branch, giving A5 rather than 3. A would only choose the right branch if B wouldmake a choice giving A more than 5. But (7,7) gives B less than15, so B would not do so.
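The selfish reasoning in this example can be sketched as plain backward induction. The tree encoding (nested lists for branches, (A, B) tuples for leaves) and the function name are our own:

```python
# Hypothetical sketch of selfish backward induction on the tree above.
def solve(node, player):
    """Outcome reached when every player chooses purely selfishly."""
    if isinstance(node, tuple):              # terminal payoff
        return node
    nxt = "B" if player == "A" else "A"      # players alternate
    results = [solve(child, nxt) for child in node]
    idx = 0 if player == "A" else 1          # maximize own coordinate
    return max(results, key=lambda p: p[idx])

# A picks a branch; B then picks a leaf inside it.
tree = [[(10, 10), (5, 12)], [(3, 15), (7, 7)]]
print(solve(tree, "A"))  # (5, 12): B is selfish in each branch; A prefers 5 to 3
```

The same function reproduces the earlier example too: with `[[(15, 5), (7, 7)], (10, 10)]` it returns (10,10), since A anticipates B taking (7,7).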

• But there are of course cases where cooperation is rational. We have discussed the prisoners dilemma and the centipede game.

[Tree: A chooses among a direct payoff (6,1); a branch where B chooses between (10,10) and (5,12); and a branch where B chooses between (3,15) and (7,7).]

If B chooses selfishly, then A would choose between (5,12), (3,15) and (6,1). Of these choices A would choose (6,1). However, both (10,10) and (7,7) are better for both players. Since (10,10) dominates, this is the rational solution.

• What we see here is the player with a choice at a node in the decision tree choosing selfishly among the ‘rational’ solutions for the subtrees at that node.

• This provides a ‘provisional’ solution and better alternatives are those that jointly benefit both players compared with the provisional solution.

• A potential problem arises when there is a choice of better alternatives.

[Tree: A chooses among a direct payoff (6,1); a branch where B chooses between (3,15) and (11,9); and a branch where B chooses between (10,10) and (5,12).]

Now the choices dominating (6,1) are (10,10), preferred by B, and (11,9), preferred by A. One might argue that B should win, because B could ‘threaten’ to choose (3,15) if sent down the middle path by A.

The problem with ‘threats’ of this sort is that they generally lead to cycles of threats, no solution, or a poor solution.

[Tree: A chooses among a direct payoff (6,1); a middle branch where B chooses between (3,15) and (11,9); and a left branch where B chooses between (5,12) and a node where A chooses between (10,10) and (12,-1).]

Now B could threaten (3,15) if sent down the middle path, but if A goes left at the first choice, and B goes left, then A could threaten to choose (12,-1). Such threats would cycle endlessly. In large decision trees, there will almost always be such competing ‘threats’ and hence no rational solution.

• One could argue that two rational players should know that solutions should be restricted to those alternatives that jointly benefit both relative to the provisional selfish solution. If so then one can prune the decision tree and re-run the algorithm to find the rational solution.

• Meta-reasoning provides another basis for reaching this answer: Rational players would want a decision strategy that would give them the best chance of reaching a solution benefitting them, and would want an approach that would do so in as many decision trees as possible.

• In all two player games, the ‘successive pruning’ strategy will converge on a unique rational solution.

• Note: Because it limits consideration to jointly better alternatives, it does not allow the types of ‘threats’ described in earlier slides.

• The method is simple:

• We start at the terminal nodes and work upwards, establishing rational solutions at each higher node in turn.

• At a given node in this process, we establish a provisional solution based on a selfish choice among that player’s choices, where each such choice is the rational solution already established for each choice node.

1. Identify all alternatives in the subtree that give both players more than the provisional solution.

2. Prune the decision tree to contain just these alternatives.

3. Iterate the general algorithm on the pruned tree, from the terminal nodes up to the root of the subtree.

4. Continue until convergence.
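The steps above can be sketched in code. This is a deliberately simplified illustration (our own encoding and names): it prunes to the jointly better leaves and chooses selfishly among them, rather than reproducing the full re-run of the algorithm over the pruned tree:

```python
# A much-simplified sketch of 'successive pruning' for two-player trees.
# Branches are nested lists; leaves are (A, B) payoff tuples; players
# alternate with A at the root. Not the authors' full algorithm.
def selfish(node, idx):
    """Selfish backward induction; idx is the mover's payoff coordinate."""
    if isinstance(node, tuple):
        return node
    return max((selfish(c, 1 - idx) for c in node), key=lambda p: p[idx])

def leaves(node):
    if isinstance(node, tuple):
        return [node]
    return [leaf for c in node for leaf in leaves(c)]

def rational_solution(tree):
    provisional = selfish(tree, 0)              # A moves first
    while True:
        # alternatives giving BOTH players more than the provisional
        better = [p for p in leaves(tree)
                  if p[0] > provisional[0] and p[1] > provisional[1]]
        if not better:
            return provisional
        # prune to the jointly better leaves; A chooses selfishly among them
        provisional = max(better, key=lambda p: p[0])

# The slide's example: A can take (6,1), or let B choose (10,10)/(5,12)
# or (3,15)/(11,9).
tree = [(6, 1), [(10, 10), (5, 12)], [(3, 15), (11, 9)]]
print(rational_solution(tree))  # (11, 9)
```

The loop terminates because each new provisional strictly increases A's payoff over a finite set of leaves, matching the "continue until convergence" step.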

• This method establishes a rational solution for a given node in the decision tree.

• We then move up a level in the tree. We first establish rational solutions for all subtrees below the new node. The new node has a provisional solution determined by a selfish choice among these subtree solutions, and the whole process iterates.

[Tree repeated: A chooses among (6,1); a branch where B chooses between (3,15) and (11,9); and a branch where B chooses between (10,10) and (5,12).]

In the pruned tree (below) A chooses (11,9) over (10,10) and that is the answer.

[Pruned tree: A chooses between (10,10) and (11,9).]

The choice (11,9) becomes the new provisional solution, but no alternative is left that dominates this so that is the final answer.

[Tree repeated: A chooses among (6,1); a middle branch where B chooses between (3,15) and (11,9); and a left branch where B chooses between (5,12) and a node where A chooses between (10,10) and (12,-1).]

Here (11,9) wins. But if we changed (12,-1) to (12,2) then (12,2) would win:

[Pruned tree for the (12,2) variant: the alternatives dominating (6,1) are (10,10), (11,9), and (12,2); A chooses (12,2).]

• These examples and the proposed algorithm make it clear how to find a rational solution in games with two players.

• We are working to formalize these arguments and the resultant algorithm, but it should be clear how it works.

• We have assumed ordinality.

• Suppose all utilities of all players are known to all players (unreasonable, but let us accept this conditionally for now).

• In one sense, nothing changes—we still have an ordering of preferences for each player (according to each player’s utilities), so one could argue the same rational solutions would appear.

• Is it rational for all players to make a common assumption about utilities, an assumption that would allow a rational solution to be found?

• For example, the players could assume that the utilities for all players are equal.

• But more is needed, because the exact common utility function would be needed. Although it is known that utilities are non-linear, perhaps an assumption of linearity could nonetheless be made?

• This chain of assumptions is difficult to defend, but even if such assumptions were rational, quantitative games raise the question of ‘threats’.

• The possibility of ‘threats’ can appear: All players can imagine that a given player will accept a small loss to punish a sufficiently greedy other player.

The rational solution is (500,6): A goes right knowing B will then choose the larger payoff. But if utilities are known, B might think to accept a dollar loss and choose (0,5). If so, A would lose the 100 available from the left branch. This loss is high enough to ‘force’ A to choose (100,100). B (and A) reason that B’s dollar loss of one unit at the second node is outweighed by A’s loss of 100. A prefers 100 to 0 more than B prefers 6 to 5. If threats are ‘agreed’ to be rational then both players would know this and it might become rational for A to choose the left branch.

[Tree: A chooses the left branch, (100,100), or the right branch, where B chooses between (0,5) and (500,6).]

• This kind of reasoning may indeed be used by humans, but whether it is rational remains an open question. Even if rational, the assumption that all utilities are communally known is very unlikely, if not impossible. If utilities are not known exactly, then we are thrown into some sort of Bayesian-like situation in which a prior over utilities must be guessed, and regardless of what sort of decision metric is assumed, guessing ensures that a fixed normative rational solution will not exist.

Empirical studies of human decision making:

• Humans attempting to decide rationally will likely be affected by quantitative differences, especially when these become large.

• Thus, the normative ordinal solution we have introduced will act best as a baseline for human performance if a human study uses a design that makes payoffs as ordinal as possible.

• Perhaps payoffs for both players could be made linear, placed on a common scale, and shrunk to a small range:

• E.g. Each player could see payoffs such as 10.1, 10.2, 10.3 etc.

Extensions to Multiple Players (>2)

• The two player algorithm provides ideas that are useful when considering the much more complex case of three or more players.

• Unfortunately, when there are three or more players there are cases without rational solutions.

• Even when solutions exist, extensions to three or more players in a general decision-tree setting are very difficult because there is a super-exponential explosion of possibilities.

• Other complexities are due to the fact that different subgroups of players can have control and influence at different points in the tree.

• Before turning to algorithms for some cases with rational solutions, let me show the problem that produces cases without rational solutions.

• Cases without solutions occur when there are unavoidable cycles among competing solutions.

[Tree: A chooses among a direct payoff (30,130,230); a branch where B chooses between (40,140,210) and (20,150,250); and a branch where C chooses between (50,120,220) and (10,110,240).]

Here is a simple cycle, without a rational solution:

B prefers (20,150,250) and C prefers (10,110,240), so the provisional at A is (30,130,230). There are two alternatives giving A more than 30. A could go left, and B would be happy with (40,140,210). But C would then ‘offer’ A more: (50,120,220). So A would go right, and A and C would be happy. But (50,120,220) is worse than (30,130,230) for both B and C, so they would revert to the provisional. And so on ad infinitum. We have a perfect cycle, where each group of two players can control a choice giving those two more, but no one dominates. (Think of ‘rock paper scissors’.)
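The cycle can be exhibited mechanically. This sketch (our own naming) checks which players strictly prefer each outcome to the next, showing that every step of the cycle is backed by a different two-player majority:

```python
# The three outcomes from the slide's cycle.
outcomes = {
    "provisional": (30, 130, 230),
    "left":        (40, 140, 210),
    "right":       (50, 120, 220),
}

def supporters(p, q):
    """Players (0=A, 1=B, 2=C) who strictly prefer outcome p to outcome q."""
    return {i for i in range(3) if p[i] > q[i]}

print(supporters(outcomes["left"], outcomes["provisional"]))   # {0, 1}: A+B
print(supporters(outcomes["right"], outcomes["left"]))         # {0, 2}: A+C
print(supporters(outcomes["provisional"], outcomes["right"]))  # {1, 2}: B+C
# Each step has a two-player majority, so the preferences cycle endlessly.
```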

[Tree repeated, with the coalition preferences: A+B prefer (40,140,210); B+C prefer (30,130,230); A+C prefer (50,120,220).]

[Tree repeated.]

Suppose we try our two player solution and prune the tree. This leaves A with the solution (50,120,220). One problem is that here the alternative (50,120,220) is at C’s node, but favors B, not C. C would never ‘offer’ this choice to A, preferring the provisional. C would only offer it to protect against (40,140,210).

[Tree as before, but A has an additional option (35,135,235).]

Could we perhaps revert to the provisional in such cases? Here we have the same cycle as in the previous tree. If we revert to (30,130,230) we see this is dominated by (35,135,235). Could this be the answer? But this alternative is also in a cycle with (40,140,210) and (50,120,220). These examples only hint at the complexities that can arise with three or more players.


• We can see no way to come to a rational solution in cases like these. Thus we conclude that decision trees with three or more players do not always have unique solutions.

• Of course not all decision trees have cycles.

• Could we perhaps simply say that any trees with cycles can be ruled out, and only those without cycles have solutions? (Assuming of course we can identify all the cases with cycles.)

• The answer is NO: Many cases (but not all cases) with cycles in subtrees have solutions.

[Tree: C chooses between (x,y,50) and a subtree containing the same kind of three-player cycle as before, with payoffs (3,13,23), (4,14,21), (5,12,22), (1,11,24), and (2,15,25).]

The existence of a cycle at A is obviously irrelevant: C goes left and obtains the maximum possible payoff, whatever are x and y. Many less trivial examples could be shown.

• The existence of cycles also means that a decision tree may have some decisions at any level of the tree that are rational even when no unique solution exists.

• Suppose we give the top level of the tree two choices, each leading to a cycle with identical structure to that we have discussed, except that the left branch has higher payoffs, so that every payoff to every player in the left branch is higher than the highest in the right branch.

• Whichever branch is chosen will not have a solution, yet the player at the first branch should obviously choose the subtree with the higher payoffs.

• Such an example also reaffirms the fact that one cannot simply ignore any branches with cycles. E.g. suppose the player at the first branch in this example had a third choice for which all players’ payoffs are lower than any other payoffs in the rest of the decision tree. Obviously this branch should not be chosen.

• This example leaves us with unanswered questions about the trees with three or more players.

• When cycles do not exist, an answer can be found (we believe). We are developing a (massively recursive) algorithm that (we believe) shows when cycles exist and when they do not, and finds the solution when there are no cycles.

• When cycles do exist, some cases have solutions and some do not, but identifying which are which remains an open issue.

• How might we develop an algorithm that will identify cases without cycles and give the solutions for those cases?

• We are getting close to producing an algorithm of this sort, but it is not quite ready for posting. When ready, we will post it on this site along with the PowerPoint.

• If there were rational solutions for all subtrees at a node, then the selfish choice among these would provide a provisional solution.

• We then need to start over at the terminal nodes and work upwards to see if there is convergence on a better alternative. Such an alternative would have to give the chooser and at least one other player a larger payoff than the provisional.

[Tree: A chooses between a direct payoff (10,0,3) and a branch where B chooses between (0,10,0) and (15,5,1).]

Here is the simplest example: The provisional solution is (10,0,3). (15,5,1) is obviously an alternative that will be the rational solution, even though it does not benefit C relative to the provisional. Of course the reason is obvious: C has no control over any of the choices. In the next case C does have control:

[Tree: A chooses between a direct payoff (10,0,3) and a branch where B chooses between (0,10,0) and a node where C chooses between (15,5,1) and (4,4,2).]

Here C would not allow (15,5,1) to be a solution because C prefers (10,0,3) to (15,5,1) and would therefore choose (4,4,2) if given the choice. Knowing this, A would choose the provisional and this would be the answer. These examples show that it is critical to consider who controls the choices at each node. More precisely, a player making a choice will not choose an alternative path that gives that player a smaller payoff than the provisional solution when there is any other choice that will give the chooser more than the alternative path.

[Tree: A chooses between a direct payoff (10,0,3) and a branch where B chooses between (0,10,0) and a node where C chooses between (15,5,2) and (4,4,1).]

Here the answer is (15,5,2). A and B would choose the path leading to C, and C prefers 2 to 1. Just as in the two person game, C must act selfishly and is not allowed to make a choice that gives C less in order to ‘threaten’ the other players.
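The role of control in this last pair of examples can be sketched as follows, with internal nodes naming their controller (the encoding and names are ours, and the jointly-better check is done by hand rather than by the full algorithm):

```python
# Sketch of the control logic in the last example: internal nodes name
# their controller; leaves are (A, B, C) payoff tuples.
def selfish(node):
    if isinstance(node, tuple):               # terminal payoff
        return node
    player, children = node
    idx = "ABC".index(player)                 # maximize own coordinate
    return max((selfish(c) for c in children), key=lambda p: p[idx])

c_node = ["C", [(15, 5, 2), (4, 4, 1)]]
tree = ["A", [(10, 0, 3), ["B", [(0, 10, 0), c_node]]]]

provisional = selfish(tree)     # (10, 0, 3): purely selfish play stops at A
through_c = selfish(c_node)     # (15, 5, 2): C's own selfish choice
# Both controllers on the path, A (15 > 10) and B (5 > 0), do better than
# the provisional, so (15, 5, 2) is the rational solution on the slide.
print(provisional, through_c)
```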

• In our proposed general algorithm, an embedded recursive procedure is used: At every stage a player A at a node will be considering current options passed up from the level below.

• There will be a series of initial (G) and alternative (F) provisional solutions established for players Aj at successively higher nodes, giving payoffs of G(Aj1), F(Aj1), G(Aj2), F(Aj2),…, respectively.

• At any given stage of the general recursion there could be none of these, or as many as two for every higher decision stage.

• A will choose the best personal option among those options passed up from below that give all of the higher Aj more than their respective G payoffs.

• If none exist, A chooses the best available personal option among those passed up from below. This might be among the choices favoring all the players Aj above, in which case it might:

– 1) survive and become an alternative F and eventually a provisional G at various stages above as the iteration continues, or

– 2) be a personal choice that gives some of the Aj poorer payoffs than their G provisionals, in which case this choice will fail at some higher stage, but nonetheless blocks some other choices that could have been made.

• Having made such a choice (if it is the first such choice at this node as the current iteration moves up from the leaves) then it becomes the G provisional. One returns to the bottom leaves of this subtree, and iterates. If a better alternative is found and survives then it becomes the F alternative at this node. If continued iteration converges on some F*, then F* becomes the G at this node. We now iterate again. Why? Because lower nodes must now satisfy the new G at this node (as well as the higher G’s for the respective players). If convergence occurs, this subtree conditional solution is passed up to the next level.

• During all these almost uncountable passes up and down the tree, a cycle may occur. If so, the whole process described here ends. There may still be a rational solution, but some other algorithm must be developed for such cases.

• However, it is possible that no cycle occurs throughout the entire process. If so the end result is the rational solution.

• We have just worked our way up to K. X chooses the best of the choices below K to get G(K). We start over at the bottom, and work our way up to J, always getting conditional rational solutions respecting G(K). A(J) makes the selfish choice of the options at J-1, and this is the provisional G(J|G(K)). We start over at the bottom, now respecting both provisionals. We reach L with Z making the selfish choice among the bi-conditionals at the level below L.

• Each chooser of a conditional choice first tries to find a selfish solution that gives the chooser and the choosers at all G’s above a better outcome than the current G’s. If none exist, the chooser makes the selfish choice.

• Whatever provisional exists, say at L, the algorithm iterates from the bottom, producing an alternative F at L. This alternative is then checked by starting over at the bottom. When we reach L the player at L makes the choice that gives that player more than F and the players at J and K more than their G’s. If none exist, the L player makes the selfish choice. This iteration at L continues until convergence or a cycle. If convergence occurs on some F then this F becomes the new G at L. Then we iterate. If we converge on some G at L then this solution is sent to the next higher level (and lower provisionals are removed).

• This continues until J is reached. The J player makes the selfish choice trying to respect higher G’s, etc. Supposing all alternatives at J converge, the result is passed up one level.

• This recursive process continues to K, where perhaps an alternative F emerges. Iterate until all alternatives at K converge, and then the result is passed up.

[Figure: a tree fragment with nodes K+1, K, J, and L, labeled A(K+1), A(K), A1(K-1), A2(K-1), G(K), F(K), G(J|G(K)), G(L|…), and F(L|…).

Legend: A = player at a node; G = provisional solution to be respected; F = alternative to G to be tested; K, J, L = nodes in the tree.]

• The full recursion, as illustrated in the previous slides may be hard to follow, so perhaps it is best to provide the computer program that executes the algorithm, rather than a verbal description of the algorithm.

• Regardless, if the algorithm is correct, we have made some progress: The proposed algorithm identifies the absence of a cycle, or the existence of at least one or more cycles. If a cycle occurs only at the terminal node, there is no solution. If a cycle occurs nowhere, then the algorithm identifies the rational solution.

• A note on the steps required by the algorithm:

• Due to the recursive nature of the algorithm, many steps can be required. At a given node let us say N steps have been used to determine the rational solution. Suppose the next higher node has M choices, each subtending a subtree of similar complexity. Then the provisional solution there needs NM steps.

• Everything is then redone, starting over at the bottom. Each iteration of alternatives to the provisional at the new level requires another NM steps—there might be K of these iterations before convergence. Think of Chinese Rings, or Tower of Hanoi.

• Thus N steps at level j becomes NMK at j+1, and so on as we move up the decision tree. If we have V levels the number of steps could be of the order (KM)^V.
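The growth rate is easy to see numerically. A rough sketch (our own, with illustrative values for M, K, and V): N steps at one level become roughly N·M·K at the next, so V levels cost on the order of (K·M)^V.

```python
# Step-count growth under the recursion described above: each level
# multiplies the work by branching_m * iterations_k.
def steps(levels, branching_m, iterations_k, base_n=1):
    n = base_n
    for _ in range(levels):
        n *= branching_m * iterations_k
    return n

print(steps(levels=5, branching_m=3, iterations_k=2))  # 6**5 = 7776
```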

• Humans might be able to reason their way through simple problems with a few choices and levels, and might be able to apply the algorithm by hand for slightly larger problems. Computers could deal with even larger problems, but even computers would quickly reach computational limits.

• One ‘simple’ example illustrates the steps for a problem without a cycle.

[Tree with nodes A1, A2, B1–B4, C2, C3 and labeled terminal payoffs: a=(9,3,5), b=(2,9,2), c=(1,10,10), d=(10,5,3), e=(6,6,6), f=(7,8,1), g=(8,7,4), h=(5,4,9), i=(3,2,8), k=(4,1,7).]

The next slides detail the recursion, but note that the first pass produces a G of k at A1. We start over: the G at B1 becomes h. We start over. An alternative d emerges at B1. We start over, and d converges at B1, becoming the new G there. We start over. Now g becomes the alternative at B1. It converges there and so becomes the new G at B1. No alternative emerges, so g is passed up to A1. It becomes the new alternative at A1. We start over. The iteration through alternatives and provisionals continues until convergence at g occurs at A1. Because no cycles emerged through the iterations, the final solution is g.

Detail the example recursion here

• (slides to be inserted later)

• Let us assume we have achieved a partially applicable, but correct algorithm.

• What does it mean to argue for a rational algorithm when one starts by assuming that rationality is a social construct? 

• At the least we hope to be able to prove the algorithm is correct via deduction: That is, it had best be the case that acceptance of the premises leads to the algorithm.

• But this is only the first step. We must convince the ‘deciding public’ that:

• the premises are reasonable, and

• the algorithm produces solutions that the ‘deciding public’ agrees are rational

• That the premises are reasonable is somewhat arguable, and we could expect humans to disagree to a greater or lesser degree.

• That the solutions are rational depends both on the acceptance of the premises and the rationality of the algorithmic answers. For the simple cases we have examined the conclusions seem reasonable (for example settling on the Pareto maximum when one exists).

• Even if the ‘deciding public’ is willing to go this far, the proposed algorithm is only ‘the answer’ until a reasonable alternative arises.

• For example, perhaps an example can be generated for which the algorithmic answer seems to most people (experts) to be incorrect. Such an example would lead to a search for a better approach.

• Of course it is not easy (!) to look to more complex problems in the search for unreasonable algorithmic answers:

• As the decision trees increase in complexity, a point is quickly reached at which there is no way for a human to judge whether an answer is reasonable: The reasoning required to assess rationality quickly exceeds the capacities of human thinking.

• Thus the best we seem able to do is use simple examples to determine whether the algorithm is rational, and then follow the algorithm blindly for complex cases.

• It could well be the case that the algorithm dictates a solution for a complex tree that ‘appears’ terrible to human players, for reasons they cannot understand. Perhaps a different solution ‘appears’ better to most people.

• If rationality is a social consensus, should the algorithmic solution be rejected due to a judgment based on limited reasoning ability? I don’t think so, because rational humans know some problems are too complex to remain accessible to human reason without algorithmic aid, and hence convergence on a rational algorithm might have to be sufficient.

• This is work in progress, so comments and criticisms of all sorts would be greatly appreciated.

• Send any thoughts to the authors:

[email protected]

[email protected]

[email protected]