A very fast intro to classic game theory

 

Games

Historically, game theory developed to study the strategic interactions among rational decision makers (players) who try to maximize their payoffs.  Consider the following game of Matching Pennies between two players A and B.  A and B simultaneously place a penny on the table.  Let H be A’s strategy of playing heads (placing his penny heads up) and T that of playing tails.  Similarly, let h be B’s strategy of playing heads and t that of playing tails.  If the pennies match, then A wins B’s penny (and keeps his own); if not, B wins A’s penny (and keeps his own).  The game can be conveniently represented by a strategy matrix:


 

                    Player B
                   h        t
    Player A  H  +1;-1    -1;+1
              T  -1;+1    +1;-1

                  Table 1

 

Here is how to read the matrix.  The top-left cell with +1;-1 tells us that when A plays H and B plays h, A wins one penny and B loses one; the cell immediately below shows that when A plays T and B plays h, A loses his penny and B wins one.  The rest of the matrix is read analogously.

Matching Pennies is a zero-sum game in that whatever one player wins must be lost by another player.  Since there are only two players, one’s wins are the other’s losses: the game is one of pure competition.
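As an illustration, the matrix can be stored as data and the zero-sum property checked mechanically.  (Here and below, the short Python sketches are only illustrations of the definitions, not part of the theory.)

    # Table 1 as a dictionary: keys are (A's strategy, B's strategy),
    # values are (A's payoff, B's payoff).
    matching_pennies = {
        ("H", "h"): (+1, -1), ("H", "t"): (-1, +1),
        ("T", "h"): (-1, +1), ("T", "t"): (+1, -1),
    }

    # In a zero-sum game the two payoffs in every cell sum to zero.
    print(all(a + b == 0 for a, b in matching_pennies.values()))  # True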

Note that in Matching Pennies each player knows:

  1. The payoffs of each player and the strategies available to each player.
  2. The fact that each player knows that all the players know this.

A game in which such knowledge is available is a game of complete knowledge.  Henceforth, unless otherwise stated, we consider only games of complete knowledge.

 

Equilibrium

The central notion for studying games is that of equilibrium, namely a combination of strategies such that each player uses a best strategy, that is, one most conducive to maximizing his payoff, given the strategies the other players are using.  (What counts as a best strategy depends on the type of equilibrium, as we shall see.)  Solving a game consists in exhibiting its equilibrium, or equilibriums if it has more than one.

 

Dominance

The simplest type of equilibrium is dominance equilibrium.  Consider the following (table 2) strategy matrix, where S1 and S2 are A’s strategies and s1 and s2 B’s.

 

                    Player B
                   s1       s2
    Player A  S1   1,1     -2,2
              S2   2,2      0,3

                  Table 2

 

A brief analysis of the payoffs shows that A should adopt strategy S2 no matter what strategy B adopts, since against each of B’s strategies S2 yields A a greater payoff than S1.  If by E(Si,sj) we understand the payoff of playing Si against sj, then

 

Sh strongly dominates Si if and only if E(Sh,sj)>E(Si,sj) for all sj’s.

 

We indicate this by Sh>Si.  So, a strategy Sh always has better payoffs than strategy Si if and only if Sh>Si.  In our case, S2 strongly dominates S1, that is, S2>S1.  Note that dominance is a relation among the strategies of one player; hence it makes sense to say (falsely) that S1 dominates S2, or to say that s1 dominates s2 (does it?), but not that, say, S1 dominates s2.  When in a game both players have a dominant strategy, the game has a dominance strategy equilibrium.  In the game we are considering, S2 and s2 provide such an equilibrium.  We shall then say that (S2, s2) is the dominance equilibrium of the game.  Notice that an equilibrium need not be fair: in this game 1,1 and 2,2 are fair outcomes, while the dominance equilibrium outcome 0,3 is not.
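The definition translates directly into a mechanical test; a minimal sketch over the table 2 payoffs:

    def strongly_dominates(payoffs, s_high, s_low):
        # s_high strongly dominates s_low if it does strictly better
        # against every strategy of the opponent.
        return all(h > l for h, l in zip(payoffs[s_high], payoffs[s_low]))

    # Table 2: A's payoffs against (s1, s2), and B's payoffs against (S1, S2).
    A = {"S1": [1, -2], "S2": [2, 0]}
    B = {"s1": [1, 2], "s2": [2, 3]}

    print(strongly_dominates(A, "S2", "S1"))  # True: S2 > S1
    print(strongly_dominates(B, "s2", "s1"))  # True: s2 > s1; (S2, s2) is the dominance equilibrium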

 

In the game in table 4, neither of B’s strategies dominates the other.  However, if we look at A’s strategies, we see that S1>S2.

 

                    Player B
                   s1       s2
    Player A  S1   4;4      2;-2
              S2   3;-3    -4;4

                  Table 4

 

Now both players know that S1>S2, and therefore both will reason as follows.  Since S1>S2, player A will always choose S1 no matter what; hence the S2 row can be deleted.  But in the reduced game s1>s2, and therefore the s2 column can be eliminated.  This leaves only one strategy per player, namely (S1, s1), which is the solution of the game.  The solution is reached by dominance iteration, namely by the sequential elimination of dominated strategies.  Games that are solvable by the elimination of dominated strategies are dominance solvable.  It is a nice feature of dominance solutions that they are unique.
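Dominance iteration is easy to automate; a minimal sketch that solves table 4 by repeatedly deleting strictly dominated rows and columns:

    # Table 4: pay[(row, col)] = (A's payoff, B's payoff).
    pay = {("S1", "s1"): (4, 4), ("S1", "s2"): (2, -2),
           ("S2", "s1"): (3, -3), ("S2", "s2"): (-4, 4)}
    rows, cols = {"S1", "S2"}, {"s1", "s2"}

    def dominated_rows(rows, cols):
        # Rows strictly dominated by some other surviving row.
        return {r for r in rows for d in rows
                if d != r and all(pay[(d, c)][0] > pay[(r, c)][0] for c in cols)}

    def dominated_cols(rows, cols):
        # Columns strictly dominated by some other surviving column.
        return {c for c in cols for d in cols
                if d != c and all(pay[(r, d)][1] > pay[(r, c)][1] for r in rows)}

    while True:
        dr, dc = dominated_rows(rows, cols), dominated_cols(rows, cols)
        if not dr and not dc:
            break
        rows, cols = rows - dr, cols - dc

    print(rows, cols)  # {'S1'} {'s1'}: the game is dominance solvable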

Even when dominance iteration does not lead to equilibrium, it is important to apply it whenever possible in order to simplify the game. 

If a strategy Sh has a payoff never worse, and at least once better, than strategy Si, then Sh weakly dominates Si; that is,

 

Sh weakly dominates Si if and only if E(Sh,sj)≥E(Si,sj) for all sj’s and E(Sh,sj)>E(Si,sj) for some sj’s.

 

We symbolize the fact that Sh weakly dominates Si by Sh≥Si.  It is possible to simplify games by also eliminating weakly dominated strategies.  For example (table 5),

 

                       Player B
                   s1       s2       s3
    Player A  S1   2;6      1;3      1;6
              S2   0;6      0;3      1;4
              S3   0;6      1;3      0;7

                     Table 5

 

S1 weakly dominates both S2 and S3, which can therefore be eliminated.  Then s2 is eliminated, because it is dominated by s1 and s3.  So the game is greatly simplified and turns out to have two equilibriums, (S1, s1) and (S1, s3).  (Note that weak dominance solutions need not be unique.)
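A minimal sketch of the weak-dominance test, applied to A’s strategies in table 5:

    def weakly_dominates(u, v):
        # u never worse than v, and strictly better at least once.
        return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

    # A's payoffs in table 5 against (s1, s2, s3).
    A = {"S1": [2, 1, 1], "S2": [0, 0, 1], "S3": [0, 1, 0]}
    print(weakly_dominates(A["S1"], A["S2"]))  # True
    print(weakly_dominates(A["S1"], A["S3"]))  # True
    # With S2 and S3 gone, B's payoffs reduce to s1: 6, s2: 3, s3: 6,
    # so s2 drops out and (S1, s1) and (S1, s3) remain.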

 

Stag Hunt and Prisoners Dilemma

Matching Pennies is a zero-sum game: the interests of the players are diametrically opposed.

The opposite kind of game is a coordination game, in which an outcome that is considered best by one player is considered best by all the others, as in Stag Hunt.  Player A must decide whether to hunt a stag (S) or a hare (H); player B must do the same.  Both A and B must pre-commit before they know what the other has decided.  The problem is that if one chooses S, one will be able to kill a stag only if the other has also chosen S, while if one chooses H, one is assured of killing a hare.  Of course, there’s more meat in a shared stag (4) than in an unshared hare (2), but choosing H is playing it safe.  In effect, this is a game of cooperation vs. defection in which cooperation provides higher payoffs but at a higher risk.

 

 

                    Player B
                   S        H
    Player A  S   4;4      0;2
              H   2;0      2;2

 

The game is not dominance solvable (check it out!), but as 4;4 is clearly the best outcome for everyone, the real problem is the difficulty of coordinating strategies to get there.
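A minimal sketch of the “check it out”: neither of A’s strategies dominates the other (by the symmetry of the payoffs, the same holds for B):

    # A's payoffs in Stag Hunt against B's choices (S, H).
    stag, hare = [4, 0], [2, 2]
    print(all(a > b for a, b in zip(stag, hare)))  # False: S does not dominate H
    print(all(b > a for a, b in zip(stag, hare)))  # False: H does not dominate S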

Strategically, the more interesting games are mixed motives games, namely ones that are neither zero-sum nor coordination games.  The most famous mixed motives game is Prisoners Dilemma.  Consider the following story.  Two criminals are arrested, and the prosecutor does not have enough evidence to convict either of a serious crime unless one or both confess; however, the two criminals do not know this.  Hence he tells each of the prisoners: “If you confess and the other guy does not, I’ll grant you immunity and you walk free.  If the other confesses and you don’t, I shall make sure that you get a heavy sentence.  If neither of you confesses, I shall settle for misdemeanor charges, with the result that you will pay a small fine and walk free.  If both of you confess, I shall charge both with a felony but also argue for shorter sentences than you would get if the other guy squeals and you do not.”  Keeping in mind that the game is one of complete knowledge, what should a prisoner do?

Here is the strategy matrix, with S representing “keep silent” and T “talk”, +10 the utility of walking free, -10 that of a heavy sentence, -6 that of felony charges with a shorter sentence, and +6 that of misdemeanor charges:

 

                     Player B
                    S          T
    Player A  S   +6;+6     -10;+10
              T  +10;-10     -6;-6

                   Table 6

 

The game is neither a coordination nor a zero-sum game, but it is dominance solvable: T dominates S; consequently, (T,T) provides a dominance equilibrium.  No matter what the other does, it’s better to talk: if you squeal and the other does not, you walk free (+10); if you squeal and the other does as well, you get a -6 payoff.  At all costs you want to avoid keeping silent when the other confesses (-10).  Self-interest prevents both from following (S,S) (both keeping silent), which would provide a better outcome for the two together.  This is why this game is called “Prisoners Dilemma”: purely self-interested private rationality leads to common failure.  One can think of Prisoners Dilemma in terms of cooperation (cooperating with the other by keeping silent) and defection (going it alone by confessing).  Notice two things:

1.     If the players are self-regarding (only trying to maximize their own payoffs) communication does not solve the problem: even if I know that you will not squeal, it is still in my self-interest to confess. 

2.     Even finite iteration of the game need not change its outcome.  For example, suppose both players know that the game will be played 10 times.  Then A knows that in the tenth round she should confess, no matter what happened earlier.  Hence, in the ninth round she should also confess, because what happens in the ninth round cannot affect what happens in the tenth; and so on, back to the first round.  The same applies to B.
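As an aside, the dominance claim about the one-shot game can be checked mechanically, just as with table 2:

    # A's payoffs in table 6 against B's choices (S, T).
    silent, talk = [6, -10], [10, -6]
    print(all(t > s for t, s in zip(talk, silent)))  # True: T strongly dominates S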

 

Fortunately, most people do not behave as classical game theory suggests; in fact, there’s ample experimental evidence that very often we tend to cooperate unless we perceive that we are being taken advantage of. Hence, since most people are conditional cooperators, it often makes sense to cooperate, at least initially, unless the stakes are so high that cooperating against a defector leads to immediate big losses.  Note that in Prisoners Dilemma it may be reasonable for the self-interested players to set up an enforcer that compels them to choose cooperation: in some cases limiting one’s options is perfectly rational as it maximizes one’s payoffs, as Hobbes famously thought. 

When Prisoners Dilemma is played an indefinite number of times between two players, the structure of the repeated game is different from that of each Prisoners Dilemma round.  We shall come back to that later.

The Prisoners Dilemma can provide a rough strategic description of many real life situations.  For example,

·       a one-shot arms “race” between two countries has the same strategic structure: arming dominates not arming, but if both countries arm they’ll incur severe expenditures they could avoid by not arming.

·       a one-shot tariff confrontation between two countries has the same logic: if A raises its tariffs and B doesn’t, A will improve its trade balance, and if A doesn’t raise tariffs and B does, A will do worse than if both raise tariffs.  So, raising tariffs dominates not raising them.  Hence, both countries will raise tariffs, with a decrease in business for both.

·       some cases of hidden free-riding, e.g. public good games in which it’s advantageous to all to contribute but more advantageous to free-ride.

 

 

Nash equilibrium

Dominance equilibrium is rare in interesting games.  However, there are other types of equilibrium.  A strategy Si available to A is a best reply to B’s strategy sj if, against sj, Si’s payoff is greater than, or at least equal to, that of any other strategy available to A.  (B’s strategies are treated in the same way.)

 

A pair of strategies Si and sj form a Nash equilibrium if and only if they are best replies to each other. 

 

In symbols,

 

E(Si, sj) ≥ E(U, sj) for every strategy U available to A, and E(sj, Si) ≥ E(V, Si) for every strategy V available to B.

 

When the symbol ≥ is replaced by >, we have a strict Nash equilibrium.

For example, in the following game (table 7)

 

 

                       Player B
                   s1       s2       s3
    Player A  S1   1,4      0,3      0,-1
              S2  -1,0      5,2      1,0
              S3   2,-1    -1,0      1,1

                     Table 7

 

there are no dominant strategies (check it out!).  However, you can verify that S2 is a best reply to s2 and vice versa.  That is, if A plays S2, then B’s best reply is s2, and A’s best reply to s2 is S2.  Consequently, (S2,s2) constitutes a Nash equilibrium.  The idea is that player A has no incentive to change strategy S2 as long as B follows s2, and vice versa.  This entails that if the conditions of the game do not change, a Nash equilibrium is stable.  Note that (S3,s3) is a Nash equilibrium as well, although a non-strict one.  Note also that although S2 is a best reply to s3, s3 is not a best reply to S2, because s2 does better (2 instead of 0); therefore (S2,s3) is not a Nash equilibrium.
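Finding all pure-strategy Nash equilibriums of a matrix game is mechanical: a cell is an equilibrium when A’s payoff is maximal in its column and B’s payoff is maximal in its row.  A minimal sketch over table 7:

    # Table 7: pay[(row, col)] = (A's payoff, B's payoff).
    pay = {("S1", "s1"): (1, 4), ("S1", "s2"): (0, 3), ("S1", "s3"): (0, -1),
           ("S2", "s1"): (-1, 0), ("S2", "s2"): (5, 2), ("S2", "s3"): (1, 0),
           ("S3", "s1"): (2, -1), ("S3", "s2"): (-1, 0), ("S3", "s3"): (1, 1)}
    rows, cols = ("S1", "S2", "S3"), ("s1", "s2", "s3")

    def is_nash(r, c):
        best_row = all(pay[(r, c)][0] >= pay[(r2, c)][0] for r2 in rows)
        best_col = all(pay[(r, c)][1] >= pay[(r, c2)][1] for c2 in cols)
        return best_row and best_col

    print([(r, c) for r in rows for c in cols if is_nash(r, c)])
    # [('S2', 's2'), ('S3', 's3')]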

Abstractly understood, human conventions are Nash equilibriums; for example, if you drive on the right side of the road, it’s best for me to do the same, and vice versa; similarly, if we want to communicate, we must use a shared language, and as long as you don’t change it I won’t change it either; if in our culture we greet each other by shaking hands, it won’t do for me to try to rub your nose with mine; however, if the greeting involved rubbing noses, then it would be a mistake on my side to try to shake your hand if I want to greet you. 

Note that

·       A dominance equilibrium is a Nash equilibrium, but not vice versa (why?) 

·       A Nash equilibrium need not be fair and need not result in the best possible payoff for either player.  For example, player B would be better off if he followed s1 and A followed S1.

 

Nash equilibriums are important for two reasons. 

·       The first has to do with dynamical systems, of which more later.  Suppose we leave players out and consider only strategies competing against one another.  Typically, they will outcompete each other until they reach a Nash equilibrium, at which point all of them will be optimal in the sense of being best responses to each other, at least as long as the environmental conditions do not change. 

·       The second reason involves the assumption of the rationality of the players, a more traditional ground for the theory.  If the game is of complete knowledge, the players intend to maximize their payoffs, they are rational and try to predict the moves of their opponents, and all this is common knowledge among them, then they can avoid much strife by settling for a Nash equilibrium.  Here is why.  Imagine a demonic book that told every player which strategy to follow to maximize his payoff given that all the others also maximize theirs.  If the book is to be fully authoritative, then it must settle for some Nash equilibrium, because otherwise at least one of the players could improve his payoff by changing strategy while the other players keep theirs.  Of course, if the game has more than one Nash equilibrium there may be coordination problems; moreover, if different players maximize their payoffs at different Nash equilibriums, they’ll find it difficult to settle on one.  What’s needed in this case is some sort of choreographer that decides which equilibrium will be adopted.

 

Multiple Nash equilibriums

As we saw above, games may have two or more Nash equilibriums that do not even produce the same payoffs.  For example, one can see that Stag Hunt has two Nash equilibriums (which are they?) with different payoffs.  Another famous example comes from Battle of the Sexes, which goes as follows.  Joe and Jill want to go on vacation.  Joe can choose to go to the sea (S) or to the mountains (M), and similarly for Jill, who can choose s (sea) or m (mountains).  Joe prefers the sea and Jill the mountains.  However, they prefer going together rather than going alone (table 8).

 

                      Jill
                   s        m
        Joe   S   4;1      0;0
              M   0;0      1;4

                  Table 8

 

This game has two Nash equilibriums, (S,s) and (M,m), that do not produce the same payoffs, and therefore are not interchangeable, as obviously Joe prefers the former and Jill the latter (check it out!). 

Many situations can be modeled by Battle of the Sexes.  For example, two companies A and B want to standardize their products but need to decide which standard to follow, A’s or B’s; or two parties want to reach an agreement but need to decide what language to use; and so on.  Note that pre-commitment is advantageous in this game: if Joe already has a non-refundable train ticket for the sea, he has an advantage over Jill, another case in which limiting one’s options may be rational.  However, Jill might decide to retaliate if she thinks Joe’s pre-commitment amounts to cheating.

 

How can the players choose among non-interchangeable Nash equilibriums? 

In some cases, this can be achieved by appealing to Pareto dominance.  An outcome X Pareto dominates an outcome Y just in case X gives every player at least as good a payoff as Y and gives at least one player a better one.  For example, in Stag Hunt (table 9),

 

                      B
                  S        H
         A    S   4;4      0;2
              H   2;0      2;2

                Table 9

 

(S,S) Pareto dominates (H,H), although both are Nash equilibriums.  A and B have an identical interest in choosing (S,S) because Stag Hunt is a coordination game, and if both players are perfectly rational, never make mistakes, and both know that, then they’ll end up with (S,S).  Note, however, that:

·       Pareto dominance does not work in Battle of the Sexes, a mixed motives rather than a coordination game.

·       In a real Stag Hunt game (one in which, among other things, players make mistakes), would the Pareto dominant equilibrium be reached if in table 9 the payoffs for (S,H) were (-1000;2), or would A get cold feet, fearing a mistake by B?
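Pareto dominance itself is a mechanical comparison, which makes the first point above concrete; a minimal sketch (the 4;1 and 1;4 pairs are the Battle of the Sexes payoffs from table 8):

    def pareto_dominates(x, y):
        # x, y: payoff tuples, one entry per player.
        return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))

    print(pareto_dominates((4, 4), (2, 2)))  # True: in table 9, (S,S) dominates (H,H)
    print(pareto_dominates((4, 1), (1, 4)))  # False: the table 8 equilibriums...
    print(pareto_dominates((1, 4), (4, 1)))  # ...are incomparable either way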

In other cases, a common conjecture on how to play the game can be the outcome of cultural practices, human psychological idiosyncrasy, or of having played the game many times before.  For example, suppose that two people want to meet in St. Louis at a given time known to both but they cannot communicate to decide where.  If they really need to meet, an obvious place is at the Arch, this being the most prominent place in the city. 

 

Mixed strategies

The games we have considered up to now are in pure strategies, meaning that no probabilities (different from 1) are associated with the strategies.  However, a player may decide to play his available pure strategies with different probabilities.  If we associate probabilities with the strategies of a player, so that Pr(S1)+Pr(S2)+…+Pr(Sn)=1, where Pr(x) indicates the probability of x, the game is one of mixed strategies.  Pure strategies are then limiting cases, with one strategy played with probability 1 and all the others with probability zero.  So, in Matching Pennies, player A might play H with probability 60% and T with probability 40%.
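Expected payoffs under mixed strategies are probability-weighted averages over the cells of the matrix; a minimal sketch, taking (purely as an illustration) the 60/40 mix above against an opponent mixing 50/50:

    # A's payoffs in Matching Pennies (table 1).
    pay_A = {("H", "h"): +1, ("H", "t"): -1, ("T", "h"): -1, ("T", "t"): +1}
    p = {"H": 0.6, "T": 0.4}   # A's mixed strategy
    q = {"h": 0.5, "t": 0.5}   # B's mixed strategy

    expected = sum(p[i] * q[j] * pay_A[(i, j)] for i in p for j in q)
    print(expected)  # 0.0: against a 50/50 opponent, every mix of A's earns 0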

A game in which a mixed strategy amounts to using S1 with probability 40% and S2 with probability 60% can be interpreted in two different ways:

  1. A single player randomizes, choosing S1 40% of the time and S2 60% of the time.
  2. The probabilities describe a population: 40% of its members play the pure strategy S1 and 60% play the pure strategy S2, and the players of any given round are drawn from it at random.

The second interpretation has more interesting applications than the first, as we shall see.

 

There are formal techniques for determining the expected payoffs of strategies and finding the Nash equilibriums in games, but we do not consider them here.

 

The Nash Existence Theorem

The most important theorem on equilibriums in classical game theory is the Nash Existence Theorem:

Every finite game (finitely many players, each with finitely many pure strategies) has at least one Nash equilibrium in mixed strategies.

Such equilibriums (if the game has more than one) need not be interchangeable, except in two-player zero-sum games.

 

Indefinitely repeated games and the Folk Theorem

Up to now, we have considered one-shot games.  However, suppose we repeat a game G an indefinite number of times.  Each round of G is called a “stage” of the repeated game.  It turns out that the repeated game has different properties from G; for example, there are a great number of Nash equilibriums of the repeated game that are not Nash equilibriums of G.  To understand the import of this we need the notions of discount factor and of signaling.

·       The discount factor δ is the equivalent in present units of one unit of value to be received one time unit from now. 

·       Hence, when δ=1, its highest value, one values goods to be received one unit of time from now exactly as much as present goods.  When life is uncertain or the future looks grim, the discount factor is low; in general, patient agents act on the basis of a high discount factor, impatient ones on the basis of a low one.  In monetary terms, when inflation is high the discount factor is small, and when inflation is low the discount factor is high.

·       In determining the expected payoff of a strategy in a repeated game in which defection matters it is important to have reliable signals telling one whether other players have defected or not.  A signal is public if all the players receive it (otherwise it’s private), and it is perfect if it correctly reports whether a player has defected or not (otherwise it’s imperfect).  

Consider now the Prisoners Dilemma with the payoffs listed in table 6, reproduced below:

 

                     Player B
                    S          T
    Player A  S   +6;+6     -10;+10
              T  +10;-10     -6;-6

 

If the players can use mixed strategies, it turns out that any point in the quadrilateral below, where the abscissa represents player A’s payoff and the ordinate player B’s, is a possible payoff outcome.  However, since by defecting each player can guarantee himself a payoff of at least -6, only the points in the quadrilateral ABCD represent strategies with payoffs greater than those resulting from universal defection.

[Graph omitted: the quadrilateral of feasible payoffs has vertices (+6,+6), (+10,-10), (-6,-6) and (-10,+10); ABCD is the part of it in which each player’s payoff exceeds -6.]

 

 

Suppose now that each player is given a list of moves that will result, if both follow their lists accurately, in an average payoff of a for player A and b for player B, corresponding to point P in the graph.  Suppose also that both players know both lists.  Then player A could set up the following trigger strategy:

“As long as player B follows his list, follow your list as well; however, if B deviates, then maximally punish him (in this specific case, defect) forever after.”

Imagine now that player B follows the equivalent strategy.  Clearly, the mixed strategies leading to P constitute a Nash equilibrium (they are best replies to each other), as any deviation from them will result in the application of the trigger strategy, which assures that the deviator (and the other player, once the deviator retaliates) will get -6 ever after.  In other words, as long as A follows his list, B has no interest in deviating from his, since otherwise A would retaliate forever after.  The same reasoning works the other way around.
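As a sketch of how the first assumption below works, take the table 6 payoffs and suppose (purely for illustration) that P is mutual cooperation and punishment is the grim trigger above.  Cooperating forever is then worth 6/(1-δ), while a one-shot deviation is worth 10 + δ(-6)/(1-δ), so the threat deters deviation exactly when δ ≥ 1/4:

    def cooperate_value(delta):
        return 6 / (1 - delta)                    # 6 + 6*delta + 6*delta**2 + ...

    def deviate_value(delta):
        return 10 + delta * (-6) / (1 - delta)    # 10 now, then -6 discounted forever

    for delta in (0.1, 0.25, 0.5, 0.9):
        print(delta, cooperate_value(delta) >= deviate_value(delta))
    # False for 0.1; True from 0.25 up: cooperation holds exactly when delta >= 0.25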

Importantly, there are three assumptions at work:

  1. The discount factor is high enough that the players care enough about their future payoffs for the trigger strategy to work.
  2. Defection signals are public.
  3. Defection signals are perfect.

Notice three things.  First, in any game each player can always play a strategy that inflicts maximum losses on his opponent, compatibly with the fact that the opponent can similarly minimize his own losses.  (Here -6 is the maximin payoff to which the original deviator may be pushed.)  Hence, the above argument is general.  Second, all of this applies to many-player games as well.  Third, conditions (2)-(3) are difficult to achieve when many players and the possibility of deception are present; in our story they were met by assuming that both players knew both lists and could therefore check each other’s moves.
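The -6 figure is A’s maximin, the best of the worst-case payoffs of his strategies, and can be computed directly:

    # A's payoffs in the table 6 game against B's choices (S, T).
    pay_A = {"S": [6, -10], "T": [10, -6]}
    print(max(min(v) for v in pay_A.values()))  # -6: by talking, A guarantees at least -6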

Since P can be any point whatever in ABCD, we have the most basic version of the Folk Theorem:

 

If (1)-(3) obtain, any point in ABCD (any payoff outcome in ABCD) can be reached by a Nash equilibrium in the repeated game.

 

So, when the Folk Theorem applies, saying that a certain outcome is the result of a Nash equilibrium is not saying much, as just about any outcome can be the result of a Nash equilibrium.  In short, when (1)-(3) apply, Nash equilibriums are plentiful.  The theorem can be extended to many cases of public but imperfect signals, but we shall not deal with that here.