MA300.2 Game Theory 2005, LSE
Answers to Problem Set 2

[1] (a) This is standard (we have even done it in class). The one-shot Cournot output of each firm can be computed to be A/3, while the payoff to each firm can be computed to be A²/9.

(b) Suppose that each firm tries to restrict its output to x. Observe that the joint monopoly output is A/2, so the per-firm joint monopoly output is A/4. It suffices, therefore, to examine outputs that lie between this value and the Cournot solution; i.e.,

(1)  A/4 ≤ x < A/3.

The normalized per-firm payoff when each firm produces x is just (A − 2x)x, while a single-period deviation followed by one-shot Cournot forever yields a payoff of

(1 − β)[max_y (A − x − y)y] + β(A²/9) = (1 − β)(A − x)²/4 + β(A²/9).

It follows that x can be supported using Nash reversion if and only if

(A − 2x)x ≥ (1 − β)(A − x)²/4 + β(A²/9),

or equivalently (do some transposing of terms) if

β ≥ [(A − x)²/4 − (A − 2x)x] / [(A − x)²/4 − A²/9].

Note that the fraction on the right-hand side is well-defined, because both numerator and denominator are strictly positive. Why? Hint: use equation (1). Also note that this fraction is strictly smaller than 1. Once again, why? Hint: use equation (1) again.

So for discount factors close enough to 1 we can achieve even the most collusive level, which is the case in which x = A/4. Indeed, substituting x = A/4 in the condition above, we see that it reduces to β ≥ 9/17.

Here is another interesting question. No matter how small β is, as long as it is not exactly zero, can some amount of collusion be supported, perhaps very small? Using the condition above, how would you answer this question? [Hint: find out what happens to the fraction on the right-hand side as x converges to A/3, the Cournot output.]

(c) Can asymmetric outcomes (the two firms producing different outputs) be supported using Nash reversion? Of course, provided the discount factor is large enough.
Using the condition derived in part (b), you can try this: suppose that one firm produces some fraction of the monopoly output and the other firm the remainder. As long as each firm's share of profits yields it more than the Cournot profit, such an arrangement can be supported for β close enough to 1.
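As a sanity check on these computations, here is a small Python sketch (not part of the original answer) that evaluates the critical discount factor β(x) = [(A − x)²/4 − (A − 2x)x] / [(A − x)²/4 − A²/9] with exact rational arithmetic, normalizing A = 1:

```python
from fractions import Fraction

A = Fraction(1)  # demand intercept, normalized to 1 for illustration

def critical_beta(x):
    """Smallest discount factor supporting per-firm output x via Nash reversion."""
    deviation = (A - x) ** 2 / 4   # best one-shot deviation payoff
    collusive = (A - 2 * x) * x    # per-firm payoff when both produce x
    cournot = A ** 2 / 9           # per-firm Cournot payoff
    return (deviation - collusive) / (deviation - cournot)

# At the joint-monopoly quantity x = A/4 the bound is exactly 9/17.
print(critical_beta(Fraction(1, 4)))   # 9/17

# As x approaches the Cournot output A/3, the bound vanishes, so some
# collusion is supportable for any strictly positive discount factor.
print(float(critical_beta(Fraction(1, 3) - Fraction(1, 1000))))

# Asymmetric splits: a firm with share s of the monopoly profit A^2/4 earns
# s*A^2/4, which beats the Cournot payoff A^2/9 exactly when s > 4/9, so any
# split with both shares in (4/9, 5/9) works for beta close enough to 1.
s = Fraction(45, 100)
print(s * A ** 2 / 4 > A ** 2 / 9)     # True
```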
[2] Repeated Cournot with more severe punishments. Do a twist on the previous model and imagine that the demand curve is defined for all prices, positive or negative, using the same equation P = A − X.

(a) What is each firm's security payoff? It is zero. Each firm can guarantee 0 by producing nothing. Moreover, it cannot guarantee any more than 0, because the other firm can always bring the price down to zero or less by producing enough output.

(b) Find the minimum discount factor for which there is a symmetric punishment path driving each player down to her security level. If such an equilibrium punishment is to exist, notice that the one-shot payoff from a deviation in the very first period of the punishment must not be more than zero. For the equilibrium conditions imply that if a(0) is the action taken in the first period of the punishment, then

0 ≥ (1 − β)d(a(0)) + β·0,

where d(a(0)) is the deviation payoff. The above clearly implies that d(a(0)) ≤ 0. This argument shows that the initial output must be at least A units per firm: if it is not, then one firm can earn a strictly positive payoff by deviating. This means in turn that the first-period price must be

p(0) = A − 2a(0) ≤ A − 2A = −A.

So we may conclude that the first-period quantity must be at least A per firm, leading to a price of at most −A and therefore a payoff of at most −A² per firm.

What about the second period onwards? Let the (normalized) payoff per firm be v from this point on. Now, because the overall payoff is presumed to hit the security level, it must be that

−A²(1 − β) + βv ≥ 0,

or

(2)  v ≥ [(1 − β)/β] A².

This is a minimum requirement. If v needs to go up (see below) so that (2) holds with strict inequality, we can still easily guarantee that the overall punishment payoff is zero by asking firms to produce a still bigger quantity in the first period, so that their profits fall even below −A² in that initial phase. All that remains is to make sure that v itself can be supported.
[For if it can, then the entire punishment path becomes supportable, as there will be no incentive to deviate during the initial phase of the punishment.] To this end, define a symmetric quantity x such that the normalized payoff (A − 2x)x equals v. Now, x must be supportable by restarting the punishment with value 0; i.e.,

(3)  (A − 2x)x ≥ (1 − β)(A − x)²/4 + β·0 = (1 − β)(A − x)²/4.
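As a numerical sanity check on this construction, here is a Python sketch (not part of the original answer) with A normalized to 1 and β set to the threshold 8/9 that emerges below: in the first period each firm produces A and earns −A²; from the second period on each produces the monopoly quantity A/4, earning v = A²/8, which satisfies (3) and makes the overall punishment value exactly zero.

```python
from fractions import Fraction

A = Fraction(1)          # demand intercept, normalized
beta = Fraction(8, 9)    # the threshold discount factor

# First period of the punishment: each firm produces A, so the price is
# A - 2A = -A and each firm earns -A^2. Deviating to zero output yields 0.
first_period = -A ** 2

# From the second period on: monopoly quantity x = A/4 per firm.
x = Fraction(1, 4)
v = (A - 2 * x) * x                  # continuation value, equals A^2/8
assert v == A ** 2 / 8

# Condition (3): x is supportable when a deviation restarts the punishment at 0.
assert (A - 2 * x) * x >= (1 - beta) * (A - x) ** 2 / 4

# The overall punishment value hits the security level, 0.
print((1 - beta) * first_period + beta * v)   # 0
```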
So the two conditions we need are, first, equation (3), and second, the condition that

(4)  (A − 2x)x = v ≥ [(1 − β)/β] A².

Notice that the last condition itself implies that (1 − β)/β ≤ 1/8, simply because the monopoly output attains the highest per-firm payoff, which is A²/8. This means that β ≥ 8/9 is necessary. It is also sufficient, because we know in fact that the monopoly output is supportable using Nash reversion when β ≥ 9/17 (see question 1). It must therefore also be supportable using the more severe punishment with value 0.

Notice that this exercise tells us that it may not always be the most efficient thing to try and solve for punishments that attain the security level. If they do, well and good. But attaining such a punishment may require too high a discount factor (compare 9/17 with 8/9). Yet it is still true (though this exercise does not ask you to show it) that punishments stronger than Nash reversion are always available.

[3] Auctions. n firms are bidding for a government contract which requires the provision of a service.

(a) What is the one-shot equilibrium of this auction? In one equilibrium each firm bids c. The contract is randomly awarded to one of the firms. Each firm earns a net payoff of 0. If n > 2, there are other equilibria as well, but all these equilibria yield the same payoffs and must involve at least two firms bidding c. (There are also mixed-strategy equilibria, but they all disappear if bids must lie in some bounded interval.)

(b) Assume the auction is repeated infinitely often, and the bidders form a collusive bidding ring. That is, the ring operates in cycles of n steps, repeated forever; at step i of each cycle, firm i is allowed to win the object. Suppose that the prearranged collusive price is denoted by p. Any undercutting is followed by reversion to the equilibrium of part (a), yielding zero payoff to all. The trick here is to observe that any firm has the greatest incentive to undercut the firm immediately after him in the sequence.
This is the point at which his next winning turn is furthest away, and therefore most heavily discounted; it is, therefore, the point at which he is most tempted. More formally, let us calculate the payoff to a firm when everyone abides by the collusive practice and it is his turn to win: it is

(p − c)[1 + β^n + β^{2n} + ...] = (p − c)/(1 − β^n),

simply because subsequent wins for a particular firm are separated by n periods. It follows that our firm's payoff from abiding by the ring at the most tempting instance is

β^{n−1} (p − c)/(1 − β^n),
because it is now time for his successor to win, so that our firm's next winning moment is n − 1 periods away, and is therefore discounted by β^{n−1}. On the other hand, by undercutting his successor by a tiny amount he can get almost p − c today, and nothing thereafter (once the punishment starts up), so the appropriate incentive constraint is

β^{n−1} (p − c)/(1 − β^n) ≥ p − c.

Removing common terms, the required condition boils down to

β^{n−1} + β^n ≥ 1.

Notice that this condition gets ever harder to satisfy as the number of firms in the ring increases (what's the intuition for that?). For two firms, the condition is

β + β² ≥ 1,

which yields

β ≥ (√5 − 1)/2 ≈ 0.62.

[4] (a) The expected payoff is given by

(5)  p²u(h) + p(1 − p)u(h − t) + p(1 − p)u(l + t) + (1 − p)²u(l).

If you maximize this with respect to t, the first-order condition tells us that

u′(h − t) = u′(l + t),

which means that the transfer should always be chosen to equalize consumption ex post. This is the best insurance that the two farmers can provide each other. That does not mean that the best insurance scheme is incentive compatible, as the deviation gains from such a scheme will be high. That is the subject of the next questions.

(b) No one giving a transfer to the other, regardless of history, is a subgame perfect equilibrium of the repeated game. Suppose that each farmer uses the strategy: never transfer anything to the other, and accept all transfers that come your way. Then it is a best response for the other farmer to use the same strategy. So it is always an equilibrium not to help each other out. The payoff from this one-shot equilibrium, normalized, is pu(h) + (1 − p)u(l).

(c) We need to consider a farmer who has just received a high output while his partner has received a low output. If the farmer abides by the scheme, he receives u(h − t) today and from tomorrow on the expected value as described by (5).
This amounts to a total of

(1 − β)u(h − t) + β[p²u(h) + p(1 − p)u(h − t) + p(1 − p)u(l + t) + (1 − p)²u(l)].

If he refuses to make the transfer, he gets more today, u(h), but from tomorrow on he is left to himself. So the overall constraint is

(1 − β)u(h − t) + β[p²u(h) + p(1 − p)u(h − t) + p(1 − p)u(l + t) + (1 − p)²u(l)]
≥ (1 − β)u(h) + β[pu(h) + (1 − p)u(l)].
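To see this constraint in action, here is a small Python sketch (the parameter values u = log, h = 2, l = 1, p = 1/2, β = 0.9 are illustrative assumptions, not part of the problem): with these numbers some insurance survives, but the full-insurance transfer t = (h − l)/2 violates the constraint.

```python
import math

u = math.log            # illustrative strictly concave utility
h, l = 2.0, 1.0         # high and low outputs (assumed values)
p, beta = 0.5, 0.9      # probability of high output, discount factor

def abide(t):
    """Left-hand side: make the transfer t today, then the scheme forever."""
    ev = (p**2 * u(h) + p*(1-p) * u(h-t) + p*(1-p) * u(l+t) + (1-p)**2 * u(l))
    return (1 - beta) * u(h - t) + beta * ev

def defect():
    """Right-hand side: keep h today, autarky forever after."""
    return (1 - beta) * u(h) + beta * (p * u(h) + (1 - p) * u(l))

print(abide(0.1) >= defect())   # True: a small transfer is supportable
print(abide(0.5) >= defect())   # False: full insurance (t = 0.5) is not
```

Note that condition (6) fails at these parameters (θ = 1 exceeds (1 − β)/[βp(1 − p)] ≈ 0.44), so the market does not shut down completely, even though full insurance is out of reach.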
(d) The left-hand side of this constraint is the utility from abiding. The right-hand side is the utility from defection followed by the punishment. Notice that the two sides are exactly equal if we set t = 0. Let us differentiate the left-hand side of the constraint with respect to t. We get the expression

−(1 − β)u′(h − t) + βp(1 − p)[u′(l + t) − u′(h − t)].

If this is always negative, then there is no hope of sustaining any level of transfer, because, as we've already noted, the two sides of the constraint are equal at t = 0, and if the derivative is negative the left-hand side grows ever smaller while the right-hand side stays the same. In fact, because u is strictly concave, the positive term in the derivative gets smaller with t and the negative terms get bigger (in absolute value), so it is sufficient to evaluate this derivative at t = 0. This yields the condition for no positive transfers:

−(1 − β)u′(h) + βp(1 − p)[u′(l) − u′(h)] < 0,

or

(6)  θ < (1 − β)/[βp(1 − p)],

where θ ≡ [u′(l) − u′(h)]/u′(h) may be thought of as a measure of the need for insurance. If this condition holds, the insurance market shuts down fully: no transfer is supportable.

[5] If β is close enough to zero, then every subgame perfect equilibrium must involve the play of a one-shot Nash equilibrium after every history, assuming that the number of actions is finite. To see this, fix any action profile a that is not a one-shot Nash equilibrium. Let

g(a) ≡ max_i [d_i(a) − f_i(a)],

where d_i(a) is the maximum payoff to player i assuming all others are playing a_{−i}, and f_i(a) is her payoff from a. Then g(a) > 0, because a isn't a Nash profile. Let g ≡ min g(a), where the minimum is taken over all profiles a that are not one-shot Nash. Because there are only finitely many action profiles, g must be positive. Let M and m be the maximum and minimum payoffs available to anyone in the game. Now pick β close enough to zero such that

(1 − β)g > β(M − m).
It is easy to see that for such β it is impossible to punish a player for deviating from a non-Nash action profile: the best differential between reward and punishment is, after all, M − m, and her gain from deviation is at least g. So there is no subgame perfect equilibrium other than the play of a one-shot Nash equilibrium in every period.

[6] The conclusion of the previous question may be false if there are infinitely many actions available to each player. Take this two-person game in which each A_i = [0, 1], and the payoff to player i from (a_i, a_j) is 2a_j − a_i². Then each i's (one-shot) dominant strategy is to set a_i = 0, but of course, it is easy to check that all symmetric action profiles (a, a) yield payoffs 2a − a², which are increasing in a on [0, 1]. For any discount factor, some a > 0 can be supported if

(1 − β)2a ≤ 2a − a²

(because the right-hand side is the normalized utility from abiding, while the left-hand side is the one-shot payoff from deviation plus 0 thereafter).
Divide through by a; the condition is then

2(1 − β) ≤ 2 − a,  or equivalently  a ≤ 2β.

No matter how close β is to zero (as long as it is not exactly 0), I can find a small enough a > 0 which is supportable as a subgame perfect Nash equilibrium outcome.

[7] Let σ be a pure strategy profile in a repeated game. For any history, player i is absolutely sure of the action profile her opponents are going to take: indeed, the action a_j is simply given by applying σ_j to the going history. Consider the strategy σ̂_i for player i in which she simply plays a one-shot best response to the action profile of the others following every conceivable history. Then her payoff can never drop below

(1 − β) Σ_{t=0}^∞ β^t d_i(a(h^t)) ≥ (1 − β) Σ_{t=0}^∞ β^t m_i = m_i,

where d_i(a) is the standard notation for the best payoff to player i when the going action profile of the others is a, a(h^t) refers to the action profile of the others following history h^t, and m_i is the minimax payoff. (The inequality holds because d_i(a) ≥ m_i for every a, by the definition of the minimax payoff.)

[8] In this model the monopolist's minimax payoff is 1; she cannot be pushed below that. So to figure out the discount factor at which the monopolist can credibly deter entry, consider the following profile. Initially, and after every history in which the monopolist has either always fought entry or no entry has occurred: no entrant enters, and the monopolist fights if there is entry. After all other histories: the entrant enters, and the monopolist accommodates. To find conditions under which this is a subgame perfect Nash equilibrium, it suffices to look only at the first kind of history (because after the second kind of history one plays one-shot Nash forever). If there is entry and the monopolist fights, then her lifetime normalized payoff (assuming the strategy profile above will thereafter be followed) is

(1 − β)·0 + β·2 = 2β.

If she deviates, she gets

(1 − β)·1 + β·1 = 1.

So she will abide by her strategy if β ≥ 1/2. The interesting thing about this equilibrium is its interpretation.
The switch in entrant behavior when the monopolist accommodates is formally a punishment, of course, as far as the monopolist is concerned, but what happens is more that the entrants now feel that their entry will no longer be attacked: the monopolist has shown herself to be weak. So a better interpretation is that accommodating once changes beliefs about how the monopolist will behave thereafter, and this allows the entrants to enter with relief.

[9] In variant 1 this is impossible, and you don't need to invoke subgame perfection. It is simply a dominant strategy to defect in every period, provided that the number of rounds is finite. In variant 2 there is a Nash equilibrium in which you can get cooperation in the first round. Assume just a two-period game. In period 1, the profile recommends the play of (U, L). If there is any deviation, then period-2 play is (X, Y). If not, we play (D, R). You can easily check that this is a Nash equilibrium, but it is not subgame perfect.
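The problem does not pin down the payoffs, but the logic can be illustrated with a hypothetical symmetric stage game (my numbers, chosen only so the inequalities work out): (U, L) gives (3, 3), a unilateral defection to D or R gains one unit, (D, R) = (1, 1) is the stage Nash equilibrium, and the threat point (X, Y) = (−2, −2) is not a stage Nash equilibrium because the best reply to Y yields −1. The sketch below checks, summing payoffs over the two periods, that the profile is a Nash equilibrium yet prescribes non-Nash play off the path.

```python
# Row player's payoffs; actions 0, 1, 2 stand for U/L, D/R, X/Y.
# Hypothetical numbers: the problem itself does not specify them.
M = [[3.0, 0.0, -1.0],
     [4.0, 1.0, -1.5],
     [-1.0, -1.5, -2.0]]

def is_stage_nash(a1, a2):
    """Is (a1, a2) a Nash equilibrium of the symmetric stage game?"""
    best1 = max(M[i][a2] for i in range(3))
    best2 = max(M[j][a1] for j in range(3))
    return M[a1][a2] == best1 and M[a2][a1] == best2

# (D, R) is a stage Nash equilibrium; the threat (X, Y) is not.
print(is_stage_nash(1, 1), is_stage_nash(2, 2))   # True False

# On-path total for each player: (U, L) in period 1, then (D, R).
on_path = M[0][0] + M[1][1]                       # 3 + 1 = 4

# Best deviation for the row player: grab the period-1 gain, then face (X, Y)
# next period and best-respond to Y.
deviate = max(M[i][0] for i in (1, 2)) + max(M[i][2] for i in range(3))

print(on_path, deviate)   # 4.0 3.0: deviating is strictly worse, so the
# profile is a Nash equilibrium; but because (X, Y) is not a stage Nash
# equilibrium, the prescribed period-2 play after a deviation is not
# credible, and the profile fails subgame perfection.
```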
[10] Suppose that each player j has a threshold m_j for acceptance of an offer. So a proposer i can get z_i = 1 − Σ_{j≠i} m_j. Proposers are chosen at random. If i rejects a proposal, two things can happen. One (with probability 1/n), i gets the initiative and therefore obtains z_i = 1 − Σ_{j≠i} m_j tomorrow. Two (with probability (n − 1)/n), i remains a responder and then gets m_i tomorrow. Discounting and then taking expected values, we must conclude that

m_i = β[(1/n)(1 − Σ_{j≠i} m_j) + ((n − 1)/n) m_i] = β[(1/n)(1 − M) + m_i],

where M is the sum of all the m_j, so that

m_i(1 − β) = (β/n)(1 − M).

The above equation immediately proves that all the m_i must be the same (the right-hand side does not depend on i); call this common value m, so that M = mn. Using this information in the equation above, mn(1 − β) = β(1 − mn), and it is immediate that

m = β/n.

Just as in the rejector-proposes variant done in class, these thresholds converge to equal division as the discount factor goes to 1. But it is also interesting to compare these thresholds with those of the rejector-proposes model, which are in your class notes. You will see that these thresholds are always lower, though they converge to the same limit. Should it surprise you that these thresholds are lower?
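As a final check, a Python sketch (illustrative n and β; the rejector-proposes recursion m = β(1 − (n − 1)m) is my assumption about the form in the class notes) confirming that m = β/n is the fixed point of the threshold equation, and that the random-proposer thresholds are indeed lower:

```python
from fractions import Fraction

n = 3
beta = Fraction(9, 10)   # illustrative discount factor

# Random-proposer model: the threshold equation
#   m = beta * [ (1/n)(1 - (n-1)m) + ((n-1)/n) m ]
# is solved by m = beta / n.
m = beta / n
rhs = beta * (Fraction(1, n) * (1 - (n - 1) * m) + Fraction(n - 1, n) * m)
print(m == rhs)          # True

# Rejector-proposes recursion (assumed form): m = beta * (1 - (n-1) m),
# which gives m = beta / (1 + beta (n-1)).
m_rp = beta / (1 + beta * (n - 1))
print(m_rp > m)          # True: the random-proposer thresholds are lower

# At beta = 1 both formulas give equal division, 1/n.
print(Fraction(1) / n == Fraction(1) / (1 + 1 * (n - 1)))   # True
```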