Leader or Follower? A Payoff Analysis in Quadratic Utility Harsanyi Economy

Sai Ma
New York University (Email: sai.ma@nyu.edu)
October 2015

Model

Agents and Belief

There are two players, indexed by $i \in \{1,2\}$. Each agent $i$ chooses an action $x_i \in \mathbb{R}$, and his payoff also depends on the other agent $j$'s action $x_j \in \mathbb{R}$. In particular, agent $i$'s payoff function $U_i : \mathbb{R}^3 \to \mathbb{R}$ is defined as

U_i(x_i, x_j, \theta) = -(x_i + \lambda x_j - \theta)^2,

where $\theta \in \Theta$ is not revealed to either agent, but both share a common prior $\mu(\theta) \sim N(\bar\theta, \sigma_\theta^2)$. In addition, agent $i$ privately observes the signal

s_i = \theta + \sigma_i \varepsilon,

where $G_i(s \mid \theta)$ denotes the distribution of player $i$'s signal and is common knowledge to both agents. We are interested in the case $\varepsilon \sim \text{iid } N(0,1)$. Each player uses Bayes' rule to update his belief about $\theta$ upon observing his signal; in particular, the posterior kernel over $\theta$ satisfies

d\mu_i(\theta \mid s_i) = \frac{dG_i(s_i \mid \theta)\, d\mu(\theta)}{\int_\Theta dG_i(s_i \mid \theta)\, d\mu(\theta)}.

The optimality condition implies that the best response of agent $i$ is

x_i = E_i[\theta \mid \Omega] - \lambda E_i[x_j \mid \Omega],

where $\Omega$ is the information set available to agent $i$.
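As a concrete illustration of this updating step, the following Python sketch (not part of the original paper; the function name and example values are illustrative) computes the one-step Gaussian posterior over $\theta$ after observing a single signal, in the form used throughout the analysis below:

    def posterior(theta_bar, sigma_theta, sigma_eps, s_i):
        """One-step Gaussian update of the prior N(theta_bar, sigma_theta^2)
        after observing s_i = theta + sigma_eps * eps with eps ~ N(0, 1).
        Returns the posterior mean and variance of theta."""
        gamma = sigma_theta**2 / (sigma_eps**2 + sigma_theta**2)
        post_mean = (1 - gamma) * theta_bar + gamma * s_i   # E[theta | s_i]
        post_var = gamma * sigma_eps**2                     # Var[theta | s_i]
        return post_mean, post_var

    # Illustrative call: prior centered at 0, prior and signal variances both equal to 2.
    print(posterior(theta_bar=0.0, sigma_theta=2**0.5, sigma_eps=2**0.5, s_i=1.0))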

Nash Equilibrium

In the symmetric Nash equilibrium (simultaneous choice), player $i$, after observing the signal $s_i$, solves

x_i = f(s_i) = \arg\max_{x_i} \left\{ \int\!\!\int U_i(x_i, x_j, \theta)\, dG(s_j \mid \theta)\, d\mu(\theta \mid s_i) \right\}.

Proposition 1  The symmetric Nash equilibrium strategy pair $f(s_1)$, $f(s_2)$ is linear in its respective signal; in particular, for $i \in \{1,2\}$,

x_i = f(s_i) = \alpha + \beta s_i,

where

\alpha = \frac{(1-\gamma)\bar\theta}{(1+\lambda\gamma)(1+\lambda)}, \qquad \beta = \frac{\gamma}{1+\lambda\gamma}, \qquad \gamma = \frac{\sigma_\theta^2}{\sigma_\varepsilon^2 + \sigma_\theta^2}.

Stackelberg Equilibrium

Without loss of generality, let player 1 be the Stackelberg leader. A Stackelberg equilibrium is a pair of strategies $(f_1(s_1), f_2(s_2, x_1))$ such that

x_2 = f_2(s_2, x_1) = \arg\max_{x_2} \left\{ \int U_2(x_2, x_1, \theta)\, d\mu_2(\theta \mid s_2, x_1) \right\},

x_1 = f_1(s_1) = \arg\max_{x_1} \left\{ \int\!\!\int U_1(x_1, f_2(s_2, x_1), \theta)\, dG_2(s_2 \mid \theta)\, d\mu_1(\theta \mid s_1) \right\}.

Leader or Follower

To simplify the analysis, we consider the simplest case, in which $\sigma_1 = \sigma_2 = \sigma_\varepsilon > 0$.

Proposition 2  The Stackelberg equilibrium strategy pair $f(s_1)$ and $f(s_2, x_1)$ is linear in its respective signal; in particular,

f(s_1) = b_1 s_1 + a_1, \qquad f(s_2, x_1) = b_2 s_2 + c_2 x_1 + a_2,

where

a_1 = \frac{(1-\gamma)\bar\theta}{1+\lambda}, \quad b_1 = \frac{\gamma}{1+\lambda}, \quad a_2 = 0, \quad b_2 = \frac{\gamma}{1+\gamma}, \quad c_2 = \frac{1-\lambda\gamma}{1+\gamma}, \qquad \gamma = \frac{\sigma_\theta^2}{\sigma_\varepsilon^2 + \sigma_\theta^2}.

Proposition 2 characterizes the Stackelberg equilibrium in this simple case. Note that, given the signals available, each agent $i$ chooses his own action $x_i$ so as to get, in expectation, as close as possible to $\theta - \lambda x_j$. (Footnote: It is easy to verify that $E[f_2(s_2, f_1(s_1)) + \lambda f_1(s_1)] = \bar\theta$.)

Using the result in Proposition 2, we obtain the following payoff functions. For the Nash player,

U_1^N(f(s_1), f(s_2), \theta) = -\big((1+\lambda)\alpha + \beta s_1 + \lambda\beta s_2 - \theta\big)^2.

For the leader (player 1),

U_1^S(s_1, s_2, \theta) = -\big(a_1 + \lambda c_2 a_1 + b_1(1+\lambda c_2) s_1 + \lambda b_2 s_2 - \theta\big)^2.   (1)

For the follower (player 2),

U_2^S(s_1, s_2, \theta) = -\big(c_2 a_1 + \lambda a_1 + b_2 s_2 + (c_2 b_1 + \lambda b_1) s_1 - \theta\big)^2.   (2)

Numerical Example

A simple simulated economy is used for illustration. In particular, we consider an environment with $\bar\theta = 0$, common prior $\mu(\theta) \sim N(0, 2)$, and signal noise distributed $N(0, 2)$, with $\lambda = 0.5$. The payoff of each agent depends on the signals observed by both agents, as specified in (1) and (2).
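A minimal Python sketch of this simulated economy is given below (illustrative only, not the original code behind Figure 1): it computes the Nash coefficients of Proposition 1 and the Stackelberg coefficients of Proposition 2, and evaluates the leader's and follower's payoffs (1) and (2) on a grid of signals, with the true $\theta$ set to 0 as an assumption.

    import numpy as np

    theta_bar, sigma_theta2, sigma_eps2, lam = 0.0, 2.0, 2.0, 0.5
    gamma = sigma_theta2 / (sigma_eps2 + sigma_theta2)

    # Symmetric Nash coefficients (Proposition 1): x_i = alpha + beta * s_i
    beta = gamma / (1 + lam * gamma)
    alpha = (1 - gamma) * theta_bar / ((1 + lam * gamma) * (1 + lam))

    # Stackelberg coefficients (Proposition 2)
    a1 = (1 - gamma) * theta_bar / (1 + lam)
    b1 = gamma / (1 + lam)
    a2 = 0.0
    b2 = gamma / (1 + gamma)
    c2 = (1 - lam * gamma) / (1 + gamma)

    def stackelberg_payoffs(s1, s2, theta=0.0):
        """Leader and follower payoffs, equations (1) and (2), at the true value theta."""
        x1 = a1 + b1 * s1
        x2 = a2 + b2 * s2 + c2 * x1
        return -(x1 + lam * x2 - theta) ** 2, -(x2 + lam * x1 - theta) ** 2

    # Evaluate on a grid of signals, in the spirit of Figure 1.
    s1, s2 = np.meshgrid(np.linspace(-5, 5, 101), np.linspace(-5, 5, 101))
    u_leader, u_follower = stackelberg_payoffs(s1, s2)
    print("share of the grid where the follower does better:",
          (u_follower > u_leader).mean())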

Figure 1: Payoff comparison between leader and follower.

Figure 1 reports the utilities of the leader and the follower, for a given $\theta$, as functions of $s_1$ and $s_2$. The black surface marks the region where the leader's payoff exceeds the follower's, whereas the colored surface marks the region where the follower's payoff exceeds the leader's. The right panel of Figure 1 provides an overhead view of the left panel, which gives a clearer picture of the payoff comparison between follower and leader. The dotted white line represents the case in which an agent's signal coincides with the true value of $\theta$. In particular, when both agents observe $s_i = \theta$, $i \in \{1,2\}$, they receive the same payoff. In addition, for each agent $i$, if his signal truly reveals $\theta$ (i.e., $s_i = \theta$), then his payoff is at least as high as the other agent's, regardless of $s_j$.

Turning to the analysis when the agents do not observe the true value of $\theta$, we first consider the follower.

Follower

In general, the follower updates his belief about $\theta$ using two signals (his own, and the one inferred from observing $x_1$ in equilibrium). Intuitively, this extra information about $\theta$ should enhance his ability to forecast $\theta$ accurately and thus yield a higher payoff than the leader's. However, this need not be the case: the extra piece of information from the leader may drag the follower toward a bad forecast. Since the follower updates by Bayes' rule, the sample mean of the collected signals is a sufficient statistic for updating, so an extremely bad signal from the leader can pull the follower's forecast away from the true value $\theta$ and lower his payoff. Indeed, Figure 1 shows that even if the leader observes a bad positive signal (above and away from $\theta$), it is still better to be the leader as long as the follower observes a bad negative signal.

Nevertheless, it is better to be the follower in the following three cases. The first case is when the leader observes a good signal, in the sense that it is close to the true value $\theta$ (e.g., $s_1 = 1$ or $-1$), but the follower observes a moderately bad signal, away from the true value (e.g., $s_2 = 2$ or $3$). The second case is when the leader observes a good signal and the follower also observes a good signal (corresponding, for example, to $s_1, s_2 = 1$). The intuition is that the extra information from the leader conveys a more accurate forecast of $\theta$ and thus raises the

follower's payoff relative to the leader, who observes only one signal. The third case is when both the leader and the follower observe extremely bad signals of the same sign (e.g., $s_1, s_2 = 5$ or $-5$). The intuition here is that the follower's payoff also depends on the leader's action. Since the leader observes an extreme signal, his action deviates greatly from the true value $\theta$. When the follower observes two extreme signals of the same sign, his belief that $\theta$ is in fact extreme is reinforced, so he takes an action similar to the leader's, and it turns out that it is the leader who suffers more from this belief.

In sum, the first two cases illustrate that if the leader observes a good signal, which helps the follower update his belief about $\theta$, and the follower also observes a good or only moderately bad signal, the net effect is to improve the follower's learning of the true $\theta$ and thus to increase his payoff. The last case showcases an interesting scenario in which both agents observe adverse signals in the same direction and both make extreme, bad forecasts of $\theta$; in that case it is better to be the follower.

Leader

On the other hand, the leader does better than the follower in two cases. The first is when the leader observes a good signal and the follower observes an extreme signal (e.g., $s_2 = 5$). In this case, the extreme bad signal observed by the follower neutralizes the positive effect of the extra information on the prediction of $\theta$, even though the leader's signal provides accurate information about $\theta$. The second case is when the leader observes a bad signal (e.g., $s_1 = 5$) whereas the follower observes an extreme signal of the opposite sign (e.g., $s_2 = -5$). The intuition here is that the two opposite-direction signals prevent the follower from updating accurately toward $\theta$; in this case the presence of the extra information hurts the follower and consequently benefits the leader, even though the leader's own signal is far from $\theta$.

Effect of Extra Information

For the follower, the extra information contained in the leader's action in some cases helps him predict the true value $\theta$ and so achieve a higher payoff; in other cases, however, the signal he himself observes can neutralize this positive effect. To better understand this argument, Figure 2 reports the leader's and follower's payoffs as functions of $s_2$, holding $s_1$ fixed at several values.

Figure 2: Payoff comparison fixing $s_1$.

First, from the upper-left panel we see that when the leader observes the true signal, $s_1 = \theta$, the leader dominates the follower regardless of the follower's signal. This is the typical first-mover advantage when the leader has perfect information ($s_1 = \theta$).

Second, as shown in the upper-right panel of Figure 2, and consistent with the findings in the previous sections, when the leader observes a good signal ($s_1 = 1$) but the follower observes a good or only moderately bad signal ($s_2 \in [0, 1]$), the extra information delivered to the follower helps him predict $\theta$; as a result, the follower obtains a larger payoff than the leader.

Third, the lower-left and lower-right panels depict the fact that when the leader observes a bad signal ($s_1 = -5$ or $5$), the follower's payoff is lower when he also observes a bad signal of the same sign (e.g., $s_1 = -5$, $s_2 = -5$). This reflects the argument that the leader's bad signal can neutralize the positive effect of the extra information, so the follower suffers from an inaccurate prediction of $\theta$ and thus earns a lower payoff. Note, however, that as the follower's signal rises above $-5$, as the lower-left panel shows, the neutralization effect gradually vanishes and the benefit of the extra information begins to dominate; from that point on, it is better to be the follower.

Conclusion

This paper analyzes the payoffs of the Stackelberg leader and follower in a quadratic-utility Harsanyi economy. Each agent receives a signal about an unobserved fundamental $\theta$ and updates his belief accordingly. In particular, the leader's belief is based solely on his own signal, but the follower, in addition to observing his own signal, can also infer the signal received by the leader. This extra information, however, has two opposite effects on the follower's payoff. On the one hand, the extra signal from the leader can help the follower update his belief about $\theta$ more accurately and thus enhance the follower's payoff, especially when the leader receives a good signal. On the

other hand, a bad signal from the leader can neutralize this positive effect and lower the follower's payoff. This paper shows that (a) if the leader observes a good signal and the follower also observes a good or moderately bad signal, then the extra information improves, in net effect, the follower's learning of the true $\theta$ and thus increases the follower's payoff; and (b) an extremely bad signal observed by the follower can neutralize the positive effect of the extra information on the prediction of $\theta$, even though the leader's signal by itself provides accurate information about $\theta$.

Appendix

Proof of Proposition 1. We guess and verify that $f(s_j) = \alpha + \beta s_j$. The best-response problem of agent $i$ then becomes

x_i = \arg\max_{x_i} \left\{ \int\!\!\int -(x_i + \lambda(\alpha + \beta s_j) - \theta)^2 \, dG(s_j \mid \theta)\, d\mu(\theta \mid s_i) \right\}.

We first consider the inner integral,

\int -(x_i + \lambda(\alpha + \beta s_j) - \theta)^2 \, dG(s_j \mid \theta) = E\big[ -(x_i + \lambda(\alpha + \beta s_j) - \theta)^2 \mid \theta \big].

Since $\varepsilon \sim N(0,1)$ and $s_j = \theta + \sigma_\varepsilon \varepsilon$, the likelihood is

p(s_j \mid \theta) = \frac{1}{\sigma_\varepsilon}\,\phi\!\left(\frac{s_j - \theta}{\sigma_\varepsilon}\right),

where $\phi$ is the pdf of the standard normal distribution, and the standardized variable $z_j = (s_j - \theta)/\sigma_\varepsilon$ satisfies $z_j \sim N(0,1)$ by construction. Substituting $s_j = \theta + \sigma_\varepsilon z_j$ and using $E(z_j) = 0$ and $E(z_j^2) = 1$, the conditional first and second moments are

E(s_j \mid \theta) = \theta, \qquad E(s_j^2 \mid \theta) = \sigma_\varepsilon^2 + \theta^2.

Using these conditional moments, the inner expectation can be written as

E\big[ -(x_i + \lambda(\alpha + \beta s_j) - \theta)^2 \mid \theta \big]
= -\big[ (x_i + \lambda\alpha - \theta)^2 + 2\lambda\beta\,\theta\,(x_i + \lambda\alpha - \theta) + (\lambda\beta)^2(\sigma_\varepsilon^2 + \theta^2) \big]
= -(\lambda\beta)^2 \sigma_\varepsilon^2 - \big\{ (x_i + \lambda\alpha) - (1 - \lambda\beta)\theta \big\}^2.

We can now evaluate the outer integral:

\int \Big[ -(\lambda\beta)^2 \sigma_\varepsilon^2 - \{ (x_i + \lambda\alpha) - (1-\lambda\beta)\theta \}^2 \Big]\, d\mu(\theta \mid s_i)
= E\Big[ -(\lambda\beta)^2 \sigma_\varepsilon^2 - \{ (x_i + \lambda\alpha) - (1-\lambda\beta)\theta \}^2 \;\Big|\; s_i \Big].

By the one-step Bayes map,

E(\theta \mid s_i) = (1-\gamma)\bar\theta + \gamma s_i, \qquad \operatorname{Var}(\theta \mid s_i) = \sigma_\varepsilon^2 \gamma, \qquad \gamma = \frac{\sigma_\theta^2}{\sigma_\varepsilon^2 + \sigma_\theta^2},

and hence

E(\theta^2 \mid s_i) = \operatorname{Var}(\theta \mid s_i) + \{E(\theta \mid s_i)\}^2 = \sigma_\varepsilon^2 \gamma + \{(1-\gamma)\bar\theta + \gamma s_i\}^2.

For notational simplicity, write $\zeta = E(\theta \mid s_i)$ and $\delta = E(\theta^2 \mid s_i)$. Then

E\Big[ -(\lambda\beta)^2\sigma_\varepsilon^2 - \{(x_i+\lambda\alpha) - (1-\lambda\beta)\theta\}^2 \;\Big|\; s_i \Big]
= -(\lambda\beta)^2\sigma_\varepsilon^2 - (x_i+\lambda\alpha)^2 + 2(1-\lambda\beta)(x_i+\lambda\alpha)\zeta - (\lambda\beta - 1)^2 \delta.

Note that $\zeta$ and $\delta$ do NOT depend on $x_i$, so the complete objective function is

x_i = \arg\max_{x_i} \Big\{ -(\lambda\beta)^2\sigma_\varepsilon^2 - (x_i+\lambda\alpha)^2 + 2(1-\lambda\beta)(x_i+\lambda\alpha)\zeta - (\lambda\beta-1)^2\delta \Big\}.

Taking the first-order condition with respect to $x_i$,

x_i + \lambda\alpha = (1-\lambda\beta)\zeta.

Plugging in $\zeta = (1-\gamma)\bar\theta + \gamma s_i$,

x_i = -\lambda\alpha + (1-\lambda\beta)\big[(1-\gamma)\bar\theta + \gamma s_i\big] = \{(1-\lambda\beta)\gamma\}\, s_i + \big\{ -\lambda\alpha + (1-\lambda\beta)(1-\gamma)\bar\theta \big\}.

By the method of undetermined coefficients,

(1-\lambda\beta)\gamma = \beta, \qquad -\lambda\alpha + (1-\lambda\beta)(1-\gamma)\bar\theta = \alpha.

The first equation gives

\beta = \frac{\gamma}{1+\lambda\gamma},

and substituting this into the second equation, we can back out $\alpha$:

-\lambda\alpha + \frac{(1-\gamma)\bar\theta}{1+\lambda\gamma} = \alpha \quad\Longrightarrow\quad \alpha = \frac{(1-\gamma)\bar\theta}{(1+\lambda\gamma)(1+\lambda)}.

Thus the symmetric NE is, for $i \in \{1,2\}$,

x_i = f(s_i) = \frac{(1-\gamma)\bar\theta}{(1+\lambda\gamma)(1+\lambda)} + \frac{\gamma}{1+\lambda\gamma}\, s_i, \qquad \gamma = \frac{\sigma_\theta^2}{\sigma_\varepsilon^2 + \sigma_\theta^2}. \qquad\blacksquare

Proof of Proposition 2. We solve the equilibrium backward, starting with the follower's problem:

f_2(s_2, x_1) = \arg\max_{x_2} E\big[ -(x_2 + \lambda x_1 - \theta)^2 \mid s_2, x_1 \big].

Unlike in the NE, the follower's Bayesian updating is based on two signals, $s_2$ and $x_1$, rather than one. Since in equilibrium the follower knows that $x_1 = b_1 s_1 + a_1$, he can infer $s_1 = (x_1 - a_1)/b_1$ (assuming $b_1 \neq 0$; the case $b_1 = 0$, a "cheap talk"-style problem, is discussed in the Cheap Talk Equilibrium section below).

Bayes updating for the Stackelberg follower. Unlike in the NE, the follower observes two signals: $s_1$ (inferred from observing $x_1$) and $s_2$ (his own). Since $s_i = \theta + \sigma_\varepsilon \varepsilon_i$, let $\bar s = (s_1 + s_2)/2$. Then

\theta \mid \bar s \;\sim\; N\!\left( \frac{2\sigma_\theta^2}{\sigma_\varepsilon^2 + 2\sigma_\theta^2}\,\bar s + \frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + 2\sigma_\theta^2}\,\bar\theta, \;\; \frac{\sigma_\varepsilon^2 \sigma_\theta^2}{\sigma_\varepsilon^2 + 2\sigma_\theta^2} \right).

Denoting

\gamma^S = \frac{2\sigma_\theta^2}{\sigma_\varepsilon^2 + 2\sigma_\theta^2},

we have

E(\theta \mid s_1, s_2) = (1-\gamma^S)\bar\theta + \gamma^S\,\frac{s_1+s_2}{2}, \qquad V(\theta \mid s_1, s_2) = \frac{\sigma_\varepsilon^2 \gamma^S}{2}.

The Bayes updating result for the leader is the same as in the NE.
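As a sanity check (not in the original), the short sketch below cross-checks this two-signal posterior against the standard precision-weighted Gaussian update and also confirms numerically the identity $\gamma^S(1+\gamma) = 2\gamma$ that is derived later in the proof; the parameter and signal values are arbitrary.

    import numpy as np

    theta_bar, sigma_theta2, sigma_eps2 = 0.0, 2.0, 2.0
    gamma = sigma_theta2 / (sigma_eps2 + sigma_theta2)
    gamma_S = 2 * sigma_theta2 / (sigma_eps2 + 2 * sigma_theta2)

    s1, s2 = 1.3, -0.4                      # arbitrary signal realizations
    s_bar = 0.5 * (s1 + s2)

    # Closed form used in the text
    mean_text = (1 - gamma_S) * theta_bar + gamma_S * s_bar
    var_text = 0.5 * sigma_eps2 * gamma_S

    # Direct precision-weighted update with two independent Gaussian signals
    prec_post = 1 / sigma_theta2 + 2 / sigma_eps2
    mean_direct = (theta_bar / sigma_theta2 + (s1 + s2) / sigma_eps2) / prec_post
    var_direct = 1 / prec_post

    assert np.isclose(mean_text, mean_direct) and np.isclose(var_text, var_direct)
    assert np.isclose(gamma_S * (1 + gamma), 2 * gamma)
    print(mean_text, var_text)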

(15) gives β = Thus plug this into (16), we can back up α γ 1 + λγ (1 γ) θ λα + 1 + λγ = α (1 γ) θ α = (1 + λγ)(1 + λ) (18) (19) Thus using (17) and (18), we solved the symmetric NE, that is i {1, } x i = f(s i ) = (1 γ) θ (1 + λγ)(1 + λ) + γ 1 + λγ s i (0) where γ = σ θ σ ɛ +σ θ Proof of Proposition. By (0), we have We solve the equilibrium backward. The Follower s problem f(s, x 1 ) = arg max E[ (x + λx 1 θ) s, x 1 ] Now, since for bayes updating, different from the NE, the Bayes updating is based on two signals (instead of one), s, and x 1. Since in the equilibrium, the follower knows that x 1 = b 1 s 1 + a 1, therefore, he can infers s 1 = x 1 a 1 b 1.(assuming b 1 0, for the case b 1 = 0, I discuss this "cheap talk" style problem in the Appendix A). Bayes Updating Result for Stackelberg Follower. For follower, different from NE, he observe two signals s 1 (inferred from observing x 1 ) and s (own signal). Since s i = θ + ε, then denote s = s 1 + s Then we have Now denote Thus we have θ s N( σ θ σ ɛ + σ θ γ S = s + σ ɛ σ ɛ + σ θ σ θ σ ɛ + σ θ θ, σ ɛ σ θ σ ɛ + σ θ ( ) E(θ s 1, s ) = (1 γ S ) θ + γ S s1 + s V (θ s 1, s ) = σ ɛγ S The Bayes updating result for leader is same as in NE ) 9

Therefore, imposing s 1 = x 1 a 1 b 1, we find that E(θ x 1, s ) = (1 γ S ) θ + γ S ( x1 a 1 b 1 + s ) Now for notation purpose, denote = γs a 1 + (1 γ S ) θ + γs x 1 + γs s E(θ x 1, s ) = ζ S and V (θ s 1, s ) = δ S Again we use the One-step bayes map, we have E[ (x + λx 1 θ) s, x 1 ] Thus, we have by () and plug into (0), Then the F.O.C with repect to x is = E[(x + λx 1 ) (x + λx 1 )θ + θ s, x 1 ] = (x + λx 1 ) + (x + λx 1 )ζ S δ S (1) f(s, x 1 ) = arg max (x + λx 1 ) + (x + λx 1 )ζ S δ S This gives the optimal strategy for follower f(s, x 1 ) = ζ S λx 1 = γs a 1 x + λx 1 = ζ S = (1 γ S ) θ γs a 1 For notation simplication, denote following + (1 γ S ) θ + γs x 1 + γs s λx 1 ( ) γ S + γs s + λ x 1 () f(s, x 1 ) = b s + c x 1 + a ( ) where b = γs, c γ = S λ, a = (1 γ S ) θ γs a 1. In order to fully characterize the follower s strategy, we need to solve for leader s problem (i.e. solve for a 1, b 1 ). We now move to the leader s problem. By (1), we have { } f(s 1 ) = arg max U 1 (x 1, f(s, x 1 ), θ)dg(s θ)dµ(θ s 1 ) (3) Now we first consider the inner integrand, we have, U 1 (x 1, f(s, x 1 ), θ)dg(s θ) = E[ (x 1 + λ{b s + c x 1 + a } θ) θ] 10

Now for notation simplicity, denote ϕ = κ θ where κ = (λc + 1)x 1 + λa Then, we have E[ (λb s + (λc + 1)x 1 + λa θ) θ] = E[(λb s + ϕ) θ] = E[ϕ + ϕλb s + (λb ) s θ] where (5) uses the equality derived in (4) and (5) We can further express (5) separating θ = ϕ ϕλb θ (λb ) (σ ɛ + θ ) (4) ϕ ϕλb θ (λb ) (σ ɛ + θ ) = κ + κθ θ λb θκ + λb θ (λb ) σ ɛ (λb ) θ = κ (λb ) σ ɛ + (κ λb κ)θ (λb 1) θ (5) Then we can further evaluate the integral using (6) 3 U 1 (x 1, f(s, x 1 ), θ)dg(s θ)dµ(θ s 1 ) = E[ κ (λb ) σ ɛ + (κ λb κ)θ (λb 1) θ s 1 ] = κ (λb ) σ ɛ + κ(1 λb )ζ (λb 1) δ (6) where ζ = E(θ s i ), δ = E(θ s i ),defined in the previous section using Bayes updating rule and κ = (λc + 1)x 1 + λa First note that κ = (λc + 1) Then we can derive the optimal strategy for player 1 using the F.O.C of (7) w.r.t x 1 (1 λb )ζ κ = κ κ (1 λb )ζ = κ (1 λb ) [ (1 γ) θ + γs 1 ] = (λc + 1)x 1 + λa Where the last line I plug in the notation ζ = E(θ s 1 ) = (1 γ) θ + γs 1 and κ = (λc + 1)x 1 + λa Thus, re-arrange the term, we solved the strategy for leader x 1 = f(s 1 ) = (1 λb )γ λc + 1 s 1 + (1 λb )(1 γ) θ λa λc + 1 (7) where γ = σ θ σ ɛ +σ θ. 3 Note the Bayes updating result for leader is same as in NE, so E[θ s 1 ] = δ and E[θ s 1 ] = ξ 11

Now using b = γs, c = ( ) γ S λ, a = (1 γ S ) θ γs a 1, we can have b 1 s 1 + a 1 = (1 λb )γ λc + 1 s 1 + (1 λb )(1 γ) θ λa λc + 1 Thus by method of under-determined coeffi cient, we have b 1 = (1 λb γs )γ (1 λ = ( ) )γ λc + 1 γ λ S λ + 1 ( ) γ S b 1 λ λ + b 1 = (1 λ γs )γ b 1 = ( λγs )γ λγ S (1 λ ) We can further simplify b 1 as follows, since γ = γ S (1 + γ) = ( σ θ σ θ σ ɛ +σ θ σ ɛ + σ θ σ θ = σ ɛ + σ θ = γ = γ λγs (1 + γ) (1 λ ), γ S = σ θ σ ɛ +σ θ ) ( ) σ ɛ + σ θ σ ɛ + σ θ Therefore, we have a very important identity relating γ S and γ as follows, Therefore, by (9) we have Similarly, we have a 1 = (1 λb )(1 γ) θ λa λc + 1 γ S (1 + γ) = γ (8) b 1 = γ λγs (1 + γ) (1 λ ) γ λγ = (1 λ ) γ = 1 + λ a 1 (1 λ ) = (1 λ γs )(1 γ) θ λ(1 γ S ) θ a 1 = ( λγs )(1 γ) θ λ(1 γ S ) θ (1 λ ) ( (1 λ) γ + λγ S (1 + γ ) ) θ a 1 = (1 λ ) = ( (1 λ γs )(1 γ) θ λ (1 γ S ) θ γs a 1 ( ) γ λ S λ + 1 1 )

Now, using the identity $\gamma^S(1+\gamma) = 2\gamma$ again,

a_1 = \frac{(1-\gamma)(1-\lambda)\bar\theta}{1-\lambda^2} = \frac{(1-\gamma)\bar\theta}{1+\lambda}.

For the follower we then have

a_2 = (1-\gamma^S)\bar\theta - \frac{\gamma^S a_1}{2 b_1}
= (1-\gamma^S)\bar\theta - \frac{\gamma^S (1-\gamma)\bar\theta}{2\gamma}
= \frac{\bar\theta\big( 2\gamma(1-\gamma^S) - \gamma^S(1-\gamma) \big)}{2\gamma}
= \frac{\bar\theta\big( 2\gamma - \gamma^S(\gamma+1) \big)}{2\gamma}
= \frac{\bar\theta( 2\gamma - 2\gamma )}{2\gamma} = 0,

where the last line uses the identity once more. Similarly,

b_2 = \frac{\gamma^S}{2} = \frac{\gamma^S(1+\gamma)}{2(1+\gamma)} = \frac{\gamma}{1+\gamma},

c_2 = \frac{\gamma^S}{2 b_1} - \lambda = \frac{\gamma^S + \gamma^S\lambda - \lambda\gamma^S(1+\gamma)}{\gamma^S(1+\gamma)} = \frac{1-\lambda\gamma}{1+\gamma},

where the first and last equalities again use the identity $\gamma^S(1+\gamma) = 2\gamma$. We see that both $f(s_1)$ and $f(s_2, x_1)$ are linear in their respective signals, which completes the proof. \blacksquare
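As an additional check (again not part of the original paper), the following SymPy sketch verifies symbolically that the Proposition 2 coefficients just derived form a fixed point of the leader's best-response mapping obtained above; the symbol and variable names are illustrative.

    import sympy as sp

    lam, gamma, theta_bar = sp.symbols('lambda gamma thetabar', positive=True)
    gamma_S = 2 * gamma / (1 + gamma)          # the identity gamma_S*(1+gamma) = 2*gamma

    # Candidate Stackelberg coefficients from Proposition 2
    b1 = gamma / (1 + lam)
    a1 = (1 - gamma) * theta_bar / (1 + lam)
    b2 = gamma_S / 2
    c2 = gamma_S / (2 * b1) - lam
    a2 = (1 - gamma_S) * theta_bar - gamma_S * a1 / (2 * b1)

    # Leader's best response to (a2, b2, c2), as derived in the proof
    b1_br = (1 - lam * b2) * gamma / (lam * c2 + 1)
    a1_br = ((1 - lam * b2) * (1 - gamma) * theta_bar - lam * a2) / (lam * c2 + 1)

    assert sp.simplify(b1_br - b1) == 0
    assert sp.simplify(a1_br - a1) == 0
    assert sp.simplify(c2 - (1 - lam * gamma) / (1 + gamma)) == 0
    assert sp.simplify(a2) == 0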

Cheap Talk Equilibrium

In this section I discuss the possible case in which the leader's action $x_1$ delivers NO information about $s_1$. This relates to equilibria of cheap-talk games, in which the leader's observed action conveys nothing about his signal. It corresponds to the case $f_1(s_1) = b_1 s_1 + a_1$ with $b_1 = 0$, so that $x_1 = f_1(s_1) = a_1$. In this case, when the follower observes $x_1$, he obtains no information about $s_1$; hence, unlike in the main text, his Bayesian updating is based only on his own signal $s_2$ (rather than on TWO signals as in the general case).

On-Path Strategy

Follower's problem (player 2). First, consider the case in which the follower observes $x_1 = a_1$ (so the leader is on the equilibrium path and follows the strategy $x_1 = a_1$). The follower's problem is the same as before,

f_2(s_2, x_1) = \arg\max_{x_2} E\big[ -(x_2 + \lambda x_1 - \theta)^2 \mid s_2, x_1 \big].

Using the one-step Bayes map based on his own signal $s_2$ only,

E\big[ -(x_2 + \lambda x_1 - \theta)^2 \mid s_2, x_1 \big] = -(x_2+\lambda x_1)^2 + 2(x_2+\lambda x_1)\zeta - \delta,

where now $\zeta = E(\theta \mid s_2) = (1-\gamma)\bar\theta + \gamma s_2$ and $\delta = E(\theta^2 \mid s_2)$. Imposing the equilibrium condition $x_1 = a_1$, the problem becomes

f_2(s_2, x_1) = \arg\max_{x_2} \Big\{ -(x_2+\lambda a_1)^2 + 2(x_2+\lambda a_1)\zeta - \delta \Big\},

and the first-order condition with respect to $x_2$ is

x_2 + \lambda a_1 = \zeta.

This gives the follower's optimal strategy

f_2(s_2, x_1) = \zeta - \lambda a_1 = (1-\gamma)\bar\theta + \gamma s_2 - \lambda a_1.

Leader's problem (player 1). The leader still solves

f_1(s_1) = \arg\max_{x_1} \left\{ \int\!\!\int U_1(x_1, f_2(s_2, x_1), \theta)\, dG(s_2 \mid \theta)\, d\mu(\theta \mid s_1) \right\}.

Consider first the inner integrand,

\int U_1(x_1, f_2(s_2, x_1), \theta)\, dG(s_2 \mid \theta) = E\big[ -(x_1 + \lambda\{(1-\gamma)\bar\theta + \gamma s_2 - \lambda a_1\} - \theta)^2 \mid \theta \big].

For notational simplicity, let $\varphi = \kappa - \theta$ with $\kappa = x_1 + \lambda(1-\gamma)\bar\theta - \lambda^2 a_1$. Then

E\big[ -(x_1 + \lambda\{(1-\gamma)\bar\theta + \gamma s_2 - \lambda a_1\} - \theta)^2 \mid \theta \big]
= -\kappa^2 - (\lambda\gamma)^2\sigma_\varepsilon^2 + 2(\kappa - \lambda\gamma\kappa)\theta - (\lambda\gamma - 1)^2\theta^2,

and evaluating the outer integral,

\int\!\!\int U_1(x_1, f_2(s_2, x_1), \theta)\, dG(s_2 \mid \theta)\, d\mu(\theta \mid s_1)
= -\kappa^2 - (\lambda\gamma)^2\sigma_\varepsilon^2 + 2\kappa(1-\lambda\gamma)\zeta - (\lambda\gamma - 1)^2\delta,

where here $\zeta = E(\theta \mid s_1)$ and $\delta = E(\theta^2 \mid s_1)$ are defined by the leader's Bayes updating rule.

Note first that $\kappa = x_1 + \lambda(1-\gamma)\bar\theta - \lambda^2 a_1$ implies $\partial\kappa/\partial x_1 = 1$. The first-order condition with respect to $x_1$ therefore gives

(1-\lambda\gamma)\zeta\,\frac{\partial\kappa}{\partial x_1} = \kappa\,\frac{\partial\kappa}{\partial x_1}
\quad\Longrightarrow\quad
(1-\lambda\gamma)\big[(1-\gamma)\bar\theta + \gamma s_1\big] = x_1 + \lambda(1-\gamma)\bar\theta - \lambda^2 a_1.

Rearranging, the leader's best response must satisfy

x_1 = a_1 = (1-\lambda\gamma)\gamma\, s_1 + (1-\lambda\gamma-\lambda)(1-\gamma)\bar\theta + \lambda^2 a_1.

By the method of undetermined coefficients, this requires

(1-\lambda\gamma)\gamma = 0, \qquad (1-\lambda\gamma-\lambda)(1-\gamma)\bar\theta + \lambda^2 a_1 = a_1,

which implies

a_1 = \frac{(1-\gamma)(1-\lambda\gamma-\lambda)\bar\theta}{1-\lambda^2}.

We therefore reach the following claim.

Claim 3  The cheap-talk equilibrium exists only when $(1-\lambda\gamma)\gamma = 0$, i.e., when $\lambda\gamma = 1$.

If the parameters do satisfy $\lambda\gamma = 1$, the follower's on-path strategy is

f_2(s_2, x_1) = (1-\gamma)\bar\theta + \gamma s_2 - \frac{\lambda(1-\gamma)(1-\lambda\gamma-\lambda)\bar\theta}{1-\lambda^2}.

Off-Path Strategy

Now consider the case in which the follower observes $x_1 \neq a_1$, so the follower knows that the leader is playing an off-path strategy. The follower then still updates only on his own signal (since the leader's strategy delivers no information about the leader's signal) and chooses his optimal action given the observed $x_1$.

Follower's problem. The follower's problem is again

f_2(s_2, x_1) = \arg\max_{x_2} E\big[ -(x_2 + \lambda x_1 - \theta)^2 \mid s_2, x_1 \big]
= \arg\max_{x_2} \Big\{ -(x_2+\lambda x_1)^2 + 2(x_2+\lambda x_1)\zeta - \delta \Big\},

where the last equality uses the one-step Bayes map (in the cheap-talk case, observing $x_1$ delivers no information about the signal $s_1$, so Bayes' rule is applied to the single signal $s_2$, with $\zeta = E(\theta \mid s_2)$ and $\delta = E(\theta^2 \mid s_2)$). The first-order condition with respect to $x_2$ is

x_2 + \lambda x_1 = \zeta,

which gives the follower's optimal strategy

f_2(s_2, x_1) = \zeta - \lambda x_1 = (1-\gamma)\bar\theta + \gamma s_2 - \lambda x_1.

Leader's problem. Similarly, the leader solves

f_1(s_1) = \arg\max_{x_1} \left\{ \int\!\!\int U_1(x_1, f_2(s_2, x_1), \theta)\, dG(s_2 \mid \theta)\, d\mu(\theta \mid s_1) \right\}.

For notational simplicity, let $\varphi = \kappa - \theta$ with $\kappa = \lambda(1-\gamma)\bar\theta + (1-\lambda^2)x_1$. The inner integrand is then

\int U_1(x_1, f_2(s_2, x_1), \theta)\, dG(s_2 \mid \theta) = -\kappa^2 - (\lambda\gamma)^2\sigma_\varepsilon^2 + 2(\kappa - \lambda\gamma\kappa)\theta - (\lambda\gamma - 1)^2\theta^2,

and evaluating the outer integral,

\int\!\!\int U_1(x_1, f_2(s_2, x_1), \theta)\, dG(s_2 \mid \theta)\, d\mu(\theta \mid s_1) = -\kappa^2 - (\lambda\gamma)^2\sigma_\varepsilon^2 + 2\kappa(1-\lambda\gamma)\zeta - (\lambda\gamma - 1)^2\delta,

where now $\zeta = E(\theta \mid s_1)$ and $\delta = E(\theta^2 \mid s_1)$ are defined by the leader's Bayes updating rule, and $\kappa = \lambda(1-\gamma)\bar\theta + (1-\lambda^2)x_1$. Note that $\partial\kappa/\partial x_1 = 1-\lambda^2$. The first-order condition with respect to $x_1$ gives

(1-\lambda\gamma)\zeta\,\frac{\partial\kappa}{\partial x_1} = \kappa\,\frac{\partial\kappa}{\partial x_1}
\quad\Longrightarrow\quad
(1-\lambda\gamma)\big[(1-\gamma)\bar\theta + \gamma s_1\big] = \lambda(1-\gamma)\bar\theta + (1-\lambda^2)x_1.

Rearranging, and again plugging in $\zeta = E(\theta \mid s_1) = (1-\gamma)\bar\theta + \gamma s_1$ and $\kappa = \lambda(1-\gamma)\bar\theta + (1-\lambda^2)x_1$, the leader's strategy is

x_1 = f_1(s_1) = \frac{(1-\lambda\gamma)\gamma}{1-\lambda^2}\, s_1 + \frac{(1-\gamma)(1-\lambda\gamma-\lambda)\bar\theta}{1-\lambda^2}.

Note that for the cheap-talk equilibrium to exist, Claim 3 requires $\lambda\gamma = 1$, in which case the leader's optimal strategy reduces to

f_1(s_1) = \frac{(1-\gamma)(1-\lambda\gamma-\lambda)\bar\theta}{1-\lambda^2},

which coincides with the on-path strategy. This off-path strategy therefore punishes the leader: indeed, any choice $x_1 \neq \frac{(1-\gamma)(1-\lambda\gamma-\lambda)\bar\theta}{1-\lambda^2}$ fails to maximize the leader's payoff.

We can now summarize our findings.

Summary 4  Suppose a cheap-talk equilibrium exists, i.e., $\lambda\gamma = 1$. Then a grim-trigger strategy for the follower is (Footnote: of course, the off-path strategy is not unique; this one maximizes the follower's payoff given his signal and punishes the leader in the sense that the leader's deviation does NOT maximize the leader's own payoff.)

f_2(s_2, x_1) =
  (1-\gamma)\bar\theta + \gamma s_2 + \dfrac{\lambda^2(1-\gamma)\bar\theta}{1-\lambda^2}    if  x_1 = \dfrac{(1-\gamma)(1-\lambda\gamma-\lambda)\bar\theta}{1-\lambda^2},
  (1-\gamma)\bar\theta + \gamma s_2 - \lambda x_1                                           otherwise,

and the leader's optimal strategy is

x_1 = f_1(s_1) = \frac{(1-\gamma)(1-\lambda\gamma-\lambda)\bar\theta}{1-\lambda^2} = -\frac{\lambda(1-\gamma)\bar\theta}{1-\lambda^2},

where the last equality uses $\lambda\gamma = 1$.

Remark: The condition for existence of a cheap-talk equilibrium is very restrictive: if we assume $\lambda < 1$, then $\lambda\gamma < 1$ (since $\gamma < 1$ by construction), and such an equilibrium fails to exist.
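For completeness, a small illustrative Python encoding of the Summary 4 strategy profile is given below. It is a sketch under the assumptions stated there (in particular $\lambda\gamma = 1$); the function name and the numerical tolerance are not from the original.

    import numpy as np

    def cheap_talk_follower(s2, x1, theta_bar, gamma, lam, tol=1e-12):
        """Grim-trigger follower strategy from Summary 4 (illustrative sketch)."""
        # On-path leader action: a_1 = (1-gamma)(1-lam*gamma-lam)*theta_bar / (1-lam^2)
        a1 = (1 - gamma) * (1 - lam * gamma - lam) * theta_bar / (1 - lam**2)
        if np.isclose(x1, a1, atol=tol):
            # On the equilibrium path: x1 reveals nothing, update on s2 only
            return (1 - gamma) * theta_bar + gamma * s2 - lam * a1
        # Off the path: best reply to the observed deviation x1, given s2 only
        return (1 - gamma) * theta_bar + gamma * s2 - lam * x1

As the Remark notes, the existence condition $\lambda\gamma = 1$ cannot hold when $\lambda < 1$ and $\gamma < 1$, so this profile is mainly of illustrative interest.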