Repeated Games with Perfect Monitoring

Mihai Manea, MIT

Repeated Games
Normal-form stage game $G = (N, A, u)$.
Players simultaneously play game $G$ at times $t = 0, 1, \ldots$
At each date $t$, players observe all past actions: $h^t = (a^0, \ldots, a^{t-1})$.
Common discount factor $\delta \in (0, 1)$.
Payoffs in the repeated game $RG(\delta)$ for $h = (a^0, a^1, \ldots)$: $U_i(h) = (1 - \delta) \sum_{t=0}^{\infty} \delta^t u_i(a^t)$.
The normalizing factor $1 - \delta$ ensures that payoffs in $RG(\delta)$ and $G$ are on the same scale.
A behavior strategy $\sigma_i$ for $i \in N$ specifies $\sigma_i(h^t) \in \Delta(A_i)$ for every history $h^t$.
Can check if $\sigma$ constitutes an SPE using the single-deviation principle.
Mihai Manea (MIT), Repeated Games, March 30, 2016
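As a small illustration of the normalization, the sketch below evaluates $(1 - \delta) \sum_t \delta^t u_t$ on truncated payoff streams. The function name and the example streams are assumptions made for illustration; they do not come from the slides.

```python
def normalized_discounted_payoff(stage_payoffs, delta):
    """Return (1 - delta) * sum_t delta**t * u_t over a finite payoff stream."""
    if not 0 < delta < 1:
        raise ValueError("delta must lie strictly between 0 and 1")
    return (1 - delta) * sum(delta**t * u for t, u in enumerate(stage_payoffs))


# A constant stream of 3, truncated after 500 periods, is worth about 3:
# the factor (1 - delta) puts the repeated-game payoff on the stage-game scale.
constant_value = normalized_discounted_payoff([3] * 500, delta=0.9)

# One period at 5 followed by 1 (truncated after 500 periods):
# approximately (1 - delta) * 5 + delta * 1 = 3 at delta = 0.5.
deviation_value = normalized_discounted_payoff([5] + [1] * 500, delta=0.5)
```

Truncation is harmless here because the omitted tail has weight $\delta^{500}$, which is negligible for both discount factors used.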

Minmax
Minmax payoff of player $i$: the lowest payoff his opponents can hold him down to if he anticipates their actions,
$\underline{v}_i = \min_{\alpha_{-i} \in \prod_{j \neq i} \Delta(A_j)} \left[ \max_{a_i \in A_i} u_i(a_i, \alpha_{-i}) \right]$
$m^i$: minmax profile for $i$, an action profile $(a_i, \alpha_{-i})$ that solves this minimization/maximization problem.
Assumes independent mixing by $i$'s opponents.
Important to consider mixed, not just pure, actions for $i$'s opponents: in the matching pennies game the minmax when only pure actions are allowed for the opponent is 1, while the actual minmax, involving mixed strategies, is 0.
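The matching pennies comparison can be checked numerically. The sketch below assumes the usual payoffs of $+1$ for a match and $-1$ for a mismatch for the player being minmaxed, and uses a grid search over the opponent's mixing probability. The grid is an illustrative shortcut, exact here only because $q = 0.5$ lies on it.

```python
# Assumed matching-pennies payoffs for the player being minmaxed:
# +1 if the two actions match, -1 otherwise.
U1 = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}
ACTIONS = ("H", "T")


def best_response_payoff(q):
    """Best payoff for player 1 when the opponent plays H with probability q."""
    return max(q * U1[(a, "H")] + (1 - q) * U1[(a, "T")] for a in ACTIONS)


# Opponent restricted to pure actions: q is either 0 or 1.
pure_minmax = min(best_response_payoff(q) for q in (0.0, 1.0))

# Opponent allowed to mix: search over a grid of probabilities.
grid = [k / 100 for k in range(101)]
mixed_minmax = min(best_response_payoff(q) for q in grid)
```

Under these assumptions `pure_minmax` is 1 and `mixed_minmax` is 0, matching the values stated on the slide.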

Equilibrium Payoff Bounds
In any SPE (in fact, in any Nash equilibrium) player $i$ obtains at least his minmax payoff: he can myopically best-respond to his opponents' actions (known in equilibrium) in each period separately.
Not true if players condition actions on correlated private information!
A payoff vector $v \in \mathbb{R}^N$ is individually rational if $v_i \geq \underline{v}_i$ for each $i \in N$, and strictly individually rational if the inequality is strict for all $i$.

Feasible Payoffs
Set of feasible payoffs: convex hull of $\{u(a) \mid a \in A\}$.
For a common discount factor $\delta$, normalized payoffs in $RG(\delta)$ belong to the feasible set.
The set of feasible payoffs includes payoffs not obtainable in the stage game using mixed strategies... some payoffs require correlation among players' actions (e.g., battle of the sexes).
A public randomization device produces a publicly observed signal $\omega^t \in [0, 1]$, uniformly distributed and independent across periods. Players can condition their actions on the signal (formally, part of the history).
Public randomization provides a convenient way to convexify the set of possible (equilibrium) payoff vectors: given strategies generating payoffs $v$ and $v'$, any convex combination can be realized by playing the strategy generating $v$ conditional on some first-period realizations of the device and $v'$ otherwise.
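To illustrate why correlation matters, the sketch below uses a hypothetical battle-of-the-sexes payoff table (the numbers are an assumption, not taken from the slides). A public coin flip between the two coordinated outcomes yields $(1.5, 1.5)$, while a grid search suggests that independent mixing never gives both players more than 1.

```python
# Hypothetical battle-of-the-sexes payoffs (an assumption, not from the slides).
PAYOFFS = {
    ("B", "B"): (2, 1),
    ("B", "S"): (0, 0),
    ("S", "B"): (0, 0),
    ("S", "S"): (1, 2),
}


def public_randomization_payoff(prob_bb):
    """Play (B, B) when the public signal falls below prob_bb, else (S, S)."""
    u_bb, u_ss = PAYOFFS[("B", "B")], PAYOFFS[("S", "S")]
    return tuple(prob_bb * x + (1 - prob_bb) * y for x, y in zip(u_bb, u_ss))


def independent_mixing_payoff(p, q):
    """Expected payoffs when player 1 plays B w.p. p and player 2 plays B w.p. q."""
    total = [0.0, 0.0]
    for a1, pr1 in (("B", p), ("S", 1 - p)):
        for a2, pr2 in (("B", q), ("S", 1 - q)):
            for k in range(2):
                total[k] += pr1 * pr2 * PAYOFFS[(a1, a2)][k]
    return tuple(total)


public_payoff = public_randomization_payoff(0.5)

# Largest payoff that independent mixing can guarantee to BOTH players (grid search).
grid = [k / 50 for k in range(51)]
best_independent_min = max(
    min(independent_mixing_payoff(p, q)) for p in grid for q in grid
)
```

On this grid the best independent mix leaves the worse-off player with 1, so the point $(1.5, 1.5)$ is feasible only through correlation such as the public signal.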

Nash Threat Folk Theorem
Theorem 1 (Friedman 1971). If $e$ is the payoff vector of some Nash equilibrium of $G$ and $v$ is a feasible payoff vector with $v_i > e_i$ for each $i$, then for all sufficiently high $\delta$, $RG(\delta)$ has an SPE with payoffs $v$.
Proof. Specify that players play an action profile that yields payoffs $v$ (using the public randomization device to correlate actions if necessary), and revert to the static Nash equilibrium permanently if anyone has ever deviated. When $\delta$ is high enough, the threat of reverting to Nash is severe enough to deter anyone from deviating.
If there is a Nash equilibrium that gives everyone their minmax payoff (e.g., prisoner's dilemma), then every strictly individually rational and feasible payoff vector is obtainable in SPE.
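The "sufficiently high $\delta$" condition can be made concrete for a single player. The sketch below assumes a deviator earns at most $d_i$ in the deviation period and the Nash payoff $e_i$ forever after, so compliance requires $v_i \geq (1 - \delta) d_i + \delta e_i$. The symbol $d_i$ and the prisoner's dilemma numbers are illustrative assumptions, not taken from the slides.

```python
def critical_discount_factor(v_i, d_i, e_i):
    """Smallest delta with v_i >= (1 - delta) * d_i + delta * e_i.

    Assumes d_i > v_i > e_i: deviating pays in the short run,
    and Nash reversion is worse than the target payoff.
    """
    if not d_i > v_i > e_i:
        raise ValueError("expected d_i > v_i > e_i")
    return (d_i - v_i) / (d_i - e_i)


def deviation_is_deterred(v_i, d_i, e_i, delta):
    """Check the Nash-reversion incentive constraint at a given delta."""
    return v_i >= (1 - delta) * d_i + delta * e_i


# Hypothetical prisoner's dilemma: cooperation pays 2, the best deviation 3,
# and the static Nash equilibrium 0.
threshold = critical_discount_factor(v_i=2, d_i=3, e_i=0)
```

With these numbers the threshold is $1/3$: the check passes at $\delta = 0.5$ and fails at $\delta = 0.2$.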

General Folk Theorem
Minmax strategies often do not constitute static Nash equilibria.
To construct SPEs in which $i$ obtains a payoff close to $\underline{v}_i$, need to threaten to punish $i$ for deviations with even lower continuation payoffs.
Holding $i$'s payoff down to $\underline{v}_i$ may require other players to suffer while implementing the punishment. Need to provide incentives for the punishers... impossible if punisher and deviator have identical payoffs.
Theorem 2 (Fudenberg and Maskin 1986). Suppose the set of feasible payoffs has full dimension $N$. Then for any feasible and strictly individually rational payoff vector $v$, there exists $\underline{\delta} < 1$ such that whenever $\delta > \underline{\delta}$, there exists an SPE of $RG(\delta)$ with payoffs $v$.
Abreu, Dutta, and Smith (1994) relax the full-dimensionality condition: only need that no two players have the same payoff function (up to a positive affine transformation).

Proof Elements
Assume first that $i$'s minmax action profile $m^i$ is pure.
Consider an action profile $a$ for which $u(a) = v$ (or a distribution over actions that achieves $v$ using public randomization).
By full-dimensionality, there exists $v'$ in the feasible individually rational set with $\underline{v}_i < v'_i < v_i$ for each $i$.
Let $w^i$ be $v'$ with $\varepsilon$ added to each player's payoff except for $i$'s; for small $\varepsilon$, $w^i$ is a feasible payoff.

Equilibrium Regimes
Phase I: play $a$ as long as there are no deviations. If $i$ deviates, switch to $\mathrm{II}_i$.
Phase $\mathrm{II}_i$: play $m^i$ for $T$ periods. If player $j$ deviates, switch to $\mathrm{II}_j$. If there are no deviations, play switches to $\mathrm{III}_i$ after $T$ periods. If several players deviate simultaneously, arbitrarily choose a $j$ among them. If $m^i$ is a pure strategy profile, it is clear what it means for $j$ to deviate. If it requires mixing... discuss at end of the proof. $T$ is independent of $\delta$ (to be determined).
Phase $\mathrm{III}_i$: play the action profile leading to payoffs $w^i$ forever. If $j$ deviates, go to $\mathrm{II}_j$.
SPE? Use the single-shot deviation principle: calculate player $i$'s payoff from complying with the prescribed strategies and check for profitable deviations at every stage of each phase.
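The three regimes can be read as a finite automaton. The sketch below tracks only the phase transitions, not the actions or payoffs. The state encoding and the tie-breaking rule (lowest index among simultaneous deviators) are illustrative assumptions.

```python
def next_state(state, deviators, T):
    """Advance the phase automaton by one period.

    state: ("I",), ("II", i, remaining) or ("III", i).
    deviators: set of players who deviated in the current period.
    T: length of the punishment phase.
    """
    if deviators:
        j = min(deviators)  # arbitrary choice among simultaneous deviators
        return ("II", j, T)
    if state[0] == "II":
        _, i, remaining = state
        # Count down the punishment; move to the reward phase when it ends.
        return ("II", i, remaining - 1) if remaining > 1 else ("III", i)
    return state  # phases I and III persist as long as nobody deviates


def run(deviation_history, T):
    """Return the phase in force at each period, starting in phase I."""
    state = ("I",)
    trace = [state]
    for deviators in deviation_history:
        state = next_state(state, deviators, T)
        trace.append(state)
    return trace


# Player 0 deviates in period 1, player 1 deviates in period 4.
trace = run([set(), {0}, set(), set(), {1}, set(), set()], T=2)
```

In this run player 0 is punished in periods 2 and 3, play moves to $\mathrm{III}_0$ in period 4, and player 1's deviation there restarts the cycle with $\mathrm{II}_1$.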

Deviations from I and II
Player $i$'s incentives
Phase I: deviating yields at most $(1 - \delta) M + \delta (1 - \delta^T) \underline{v}_i + \delta^{T+1} v'_i$, where $M$ is an upper bound on $i$'s feasible payoffs, and complying yields $v_i$. For fixed $T$, if $\delta$ is sufficiently close to 1, complying produces a higher payoff than deviating, since $v'_i < v_i$.
Phase $\mathrm{II}_i$: suppose there are $T' \leq T$ remaining periods in this phase. Then complying gives $i$ a payoff of $(1 - \delta^{T'}) \underline{v}_i + \delta^{T'} v'_i$, whereas deviating can't help in the current period since $i$ is being minmaxed and leads to $T$ more periods of punishment, for a total payoff of at most $(1 - \delta^{T+1}) \underline{v}_i + \delta^{T+1} v'_i$. Thus deviating is worse than complying.
Phase $\mathrm{II}_j$: with $T'$ remaining periods, $i$ gets $(1 - \delta^{T'}) u_i(m^j) + \delta^{T'} (v'_i + \varepsilon)$ from complying and at most $(1 - \delta) M + (\delta - \delta^{T+1}) \underline{v}_i + \delta^{T+1} v'_i$ from deviating. For high $\delta$, complying is preferred.
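A quick numerical check of the Phase I and Phase $\mathrm{II}_i$ comparisons, under made-up parameter values: here `V_MIN` stands for the minmax payoff $\underline{v}_i$ and `V_PRIME` for $v'_i$, and none of the numbers come from the slides. This is a sanity check of the inequalities, not a proof.

```python
def punished_deviation_bound(M, v_min, v_prime, delta, T):
    """One period at M, then T periods at the minmax payoff, then v_prime forever."""
    return (1 - delta) * M + delta * (1 - delta**T) * v_min + delta**(T + 1) * v_prime


def phase2_own_comply(v_min, v_prime, delta, remaining):
    """Payoff to i from sitting out the remaining periods of his own punishment."""
    return (1 - delta**remaining) * v_min + delta**remaining * v_prime


def phase2_own_deviate_bound(v_min, v_prime, delta, T):
    """Bound on i's payoff if he deviates during his own punishment."""
    return (1 - delta**(T + 1)) * v_min + delta**(T + 1) * v_prime


# Hypothetical values with V_MIN < V_PRIME < V < M.
M, V_MIN, V_PRIME, V, T = 3.0, 0.5, 1.5, 2.0, 3

# Phase I: complying (payoff V) beats deviating for a patient player only.
phase1_ok_patient = V >= punished_deviation_bound(M, V_MIN, V_PRIME, 0.9, T)
phase1_ok_impatient = V >= punished_deviation_bound(M, V_MIN, V_PRIME, 0.2, T)

# Phase II_i: complying beats deviating for every number of remaining periods.
phase2_ok = all(
    phase2_own_comply(V_MIN, V_PRIME, 0.9, r)
    >= phase2_own_deviate_bound(V_MIN, V_PRIME, 0.9, T)
    for r in range(1, T + 1)
)
```

With these values the Phase I constraint holds at $\delta = 0.9$ and fails at $\delta = 0.2$, while the Phase $\mathrm{II}_i$ comparison holds for every $T' \leq T$ because restarting the punishment only delays the switch to $v'_i$.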

Deviations from III
Player $i$'s incentives
Phase $\mathrm{III}_i$: determines the choice of $T$. By following the prescribed strategies, $i$ receives $v'_i$ in every period. A (one-shot) deviation leaves $i$ with at most $(1 - \delta) M + \delta (1 - \delta^T) \underline{v}_i + \delta^{T+1} v'_i$. Rearranging, $i$ compares $(\delta + \delta^2 + \ldots + \delta^T)(v'_i - \underline{v}_i)$ with $M - v'_i$. For any $\underline{\delta} \in (0, 1)$ close enough to 1, $\exists\, T$ s.t. the former term is greater than the latter for $\delta > \underline{\delta}$.
Phase $\mathrm{III}_j$: player $i$ obtains $v'_i + \varepsilon$ forever if he complies with the prescribed strategies. A deviation by $i$ triggers phase $\mathrm{II}_i$, which yields at most $(1 - \delta) M + \delta (1 - \delta^T) \underline{v}_i + \delta^{T+1} v'_i$ for $i$. Again, for sufficiently large $\delta$, complying is preferred.
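The rearrangement for Phase $\mathrm{III}_i$ can be spelled out step by step. This is a sketch using the symbols above, with $\underline{v}_i$ the minmax payoff and $v'_i$ the payoff $i$ receives in his own reward phase. The final remark on how to pick $T$ is one natural reading of the argument, not a quotation from the slides.

```latex
\begin{align*}
v'_i &\geq (1-\delta) M + \delta (1-\delta^T)\, \underline{v}_i + \delta^{T+1} v'_i \\
\iff\quad (1-\delta^{T+1})\, v'_i - \delta (1-\delta^T)\, \underline{v}_i &\geq (1-\delta) M \\
\iff\quad \delta (1-\delta^T)\, (v'_i - \underline{v}_i) &\geq (1-\delta)(M - v'_i)
  && \text{using } 1-\delta^{T+1} = (1-\delta) + \delta(1-\delta^T) \\
\iff\quad (\delta + \delta^2 + \ldots + \delta^T)\, (v'_i - \underline{v}_i) &\geq M - v'_i
  && \text{using } \tfrac{\delta(1-\delta^T)}{1-\delta} = \textstyle\sum_{k=1}^{T} \delta^k .
\end{align*}
% As \delta \to 1 the left-hand side tends to T (v'_i - \underline{v}_i),
% so any T > (M - v'_i) / (v'_i - \underline{v}_i) works for \delta close enough to 1.
```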

Mixed Minmax
What if minmax strategies are mixed? Punishers may not be indifferent between the actions in the support... need to provide incentives for mixing in phase II.
Change phase III strategies so that during phase $\mathrm{II}_j$ player $i$ is indifferent among all possible sequences of $T$ realizations of his prescribed mixed action under $m^j$.
Make the reward $\varepsilon_i$ of phase $\mathrm{III}_j$ dependent on the history of phase $\mathrm{II}_j$ play.

Dispensing with Public Randomization
Sorin (1986) shows that for high $\delta$ we can obtain any convex combination of stage game payoffs as a normalized discounted value of a deterministic path $(u(a^t))$... time averaging.
Fudenberg and Maskin (1991): can dispense with the public randomization device for high $\delta$, while preserving incentives, by appropriate choice of which periods to play each pure action profile involved in any given convex combination. Idea is to stay within $\varepsilon^2$ of target payoffs at all stages.
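The time-averaging idea can be illustrated in the simplest case of two pure action profiles. The sketch below is a greedy construction for that special case with $\delta \geq 1/2$, not Sorin's general argument: it picks a deterministic 0/1 path whose normalized discounted frequency of the first profile approximates a target weight. The function names are hypothetical.

```python
def deterministic_path(weight, delta, periods):
    """Greedy 0/1 path whose discounted frequency of 1s approximates `weight`.

    Requires delta >= 1/2 so that the remaining target always stays in [0, 1].
    """
    if not 0.5 <= delta < 1:
        raise ValueError("this sketch assumes 1/2 <= delta < 1")
    path, w = [], weight
    for _ in range(periods):
        if w >= 1 - delta:
            path.append(1)  # play the first profile this period
            w = (w - (1 - delta)) / delta
        else:
            path.append(0)  # play the second profile this period
            w = w / delta
    return path


def realized_weight(path, delta):
    """Normalized discounted frequency of the first profile along the path."""
    return (1 - delta) * sum(delta**t * x for t, x in enumerate(path))


# Target: equal weight on two profiles, without any public randomization.
path = deterministic_path(weight=0.5, delta=0.9, periods=200)
achieved = realized_weight(path, delta=0.9)
```

The resulting payoff vector is `achieved * u(a) + (1 - achieved) * u(b)` for the two profiles `a` and `b`. After 200 periods the gap to the target weight is at most about $\delta^{200}$, up to floating-point error.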

MIT OpenCourseWare https://ocw.mit.edu 14.16 Strategy and Information Spring 2016 For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.