Solving sequential games with backward induction

Many games involve simultaneous plays, or at least plays in which a player did not know what strategy the others had followed until after he had made his move.  However, many games are sequential, and if a player knows the strategies used by previous players the game is one of perfect information.  (Remember that such a game is also one of complete knowledge).  Backward induction can be used to solve such games and obtain Nash equilibria.  We can illustrate the point by considering the three person Pay-raise Voting Game.

There are three legislators who have to decide how to vote on a pay raise bill.  If they vote ‘yes’ they will face popular backlash, with a negative utility of –b.  However, if the pay raise is approved, there will be a utility of +p, with b<p.  The pay raise bill passes as long as at least two people vote for it.  It may be thought that it is best to vote last because then one can see what the other two did, but a tree lets us see that it is not so.

Here the square marked 1 denoted the decision of the first player to vote, those marked 2 the decisions of the second, and so on.  The payoffs are given to the right of every branch.  For example, in the topmost branch (all vote yes) each has a payoff of p-b; in the branch immediately below it (1 and 2 vote yes and 3 votes no) 1’s  and 2’s payoffs are p-b and 3’s payoff is p.

Let us use regressive induction starting with the topmost rightmost square, where player 3 must decide whether to vote yes or no.  If she votes yes, her payoff is p-b and if she votes no her payoff is p.  Consequently, we may assume that she will vote no and prune the topmost branch.  By the same reasoning, we may prune the tree with respect to 3’s decisions (As usual we put an x on the branches which are eliminated).  Based on what one may expect 3 to do, 2 makes his decision.  For example, on the topmost square marked 2, we may prune off the ‘yes’ branch because by choosing that branch 2’s payoff is p-b while by choosing ‘no’ his payoff is p.   The lower decision box for 2 is handled analogously.  We are then left with only two branches for 1’s decision; if he votes yes, his expected payoff is p-b, and if he votes no it is p.  Hence, he should vote ‘no’ and let the other two who vote after him take the heat.  In effect, by voting ‘no’ he forces them to vote ‘yes’ and suffer the voter’s backlash; by contrast, he can claim he voted ‘no’ while getting the raise.  One might think that moving last and knowing what the other players have done is always advantageous.  However, this is wrong.  In many games, though not in all, the player who moves first enjoys a first-mover advantage.  Note that no matter what he does, 1 stands to gain as long as the other two aim at the highest payoffs.   The strategy set (1 votes no; 2 votes yes; 3 votes yes) is a Nash equilibrium: each player’s strategy is the best possible against those of the other two.

A subgame is any part of a game that remains to be played after a given set of moves.  For example, any of the game parts to the right of any box in the Pay-raise Voting Game is a subgame.  It turns out that regressive induction allows us to come up not only with a Nash equilibrium, but with a subgame perfect equilibrium, namely an equilibirum that is Nash not only in the whole game, but also in every subgame.

Regressive induction eliminates what are called “incredible threats” from the tree.  For example, player 3 may threaten to vote ‘yes’ at the top of the tree, thus choosing p-b instead of p.  However, since it is not to her advantage to do so, this threat may be considered non-credible if the game is played once and she aims at maximizing her payoff.  Note, however, that if the game is played several times, at some point representatives 2 and 3 should vote ‘no’ in order to avoid being seen as soft touches or, alternatively, they may require that the voting order be rotated periodically.

Powerful as it is, regressive induction in games has some serious practical and theoretical limitations.  If either of players 2 and 3 perceive 1’s behavior as unfair, they may engage in retribution even at a disadvantage to themselves, although they know that they will not play with player 1 again; in fact, there is powerful experimental evidence that this is exactly what most of us would do.  In short, many people do carry out incredible threats even in one-round games.  As using regressive induction with players who do not use it typically fails to maximize one’s expeced payoff, one should know one’s co-players before deciding what strategy to follow.

In addition to the practical problem we just considered, there is an even more serious theoretical one that does not depend on human psychology.  Consider the following game, Centipede.   Two players, A and B start with \$2 each, alternating rounds.  On the first round, A can stop the game by stealing \$2 from B, or continue the game, in which case he will receive \$1 from Nature.  Now B can either stop the game by getting \$2 from A, or continue, in which case Nature will reward him with \$1.  The game continues until one of the two players stops or both have \$5.

By backward induction, A should stop the game at the first round and pocket \$4.  However, it would be better if he continued the game until he can get \$6 by stopping it at the penultimate round, or, as a second best, until the third round or the end of the game, both with a payoff of \$5.  The problem is caused by the fact that the payoffs oscillate between high and low, although the general direction is upward.  So, although backward induction is unchallenged in decision theory, where the player’s counterpart is incapable of strategic choices, it is far more controversial in sequential game theory, where player engage in strategic decisions.