Algorithmic Game Theory and Applications
Lecture 11: Games of Perfect Information
Kousha Etessami

finite games of perfect information

Recall, a perfect information (PI) game has only 1 node per information set.

Theorem ([Kuhn '53]) Every finite n-person extensive PI-game, G, has a NE, in fact a subgame-perfect NE (SPNE), in pure strategies. I.e., some pure profile s* = (s*_1, ..., s*_n) is a SPNE.

Before proving this, we need some definitions. For a game G with game tree T, and for w ∈ T, define the subtree T_w ⊆ T by:

    T_w = { w' ∈ T : w' = w w'' for some w'' ∈ Σ* }

Since the game tree is finite, we can just associate payoffs to the leaves. Thus the subtree T_w, in an obvious way, defines a subgame, G_w, which is also a PI-game.

Let the depth of a node w in T be its length |w| (as a string). Let the depth of a tree T be the maximum depth of any node in T. Let the depth of a game G be the depth of its game tree.
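To make these definitions concrete, here is a minimal Python sketch (not from the lecture; the class name `Node` and its fields are illustrative assumptions) of a finite PI game tree, with payoffs attached to leaves and chance probabilities q_w(a) attached to nature's nodes:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A node w of a finite PI game tree.

    player:   0 for a chance node ("nature"), i > 0 if the node belongs to player i.
    children: maps each action a in Act(w) to the child node wa.
    payoffs:  at a leaf, the payoff vector (u_1(w), ..., u_n(w)).
    probs:    at a chance node, the probability q_w(a) of each action a.
    """
    player: int = 0
    children: dict = field(default_factory=dict)
    payoffs: tuple = ()
    probs: dict = field(default_factory=dict)

    def is_leaf(self):
        return not self.children

def depth(node):
    """Depth of the subtree rooted at node: the maximum number of actions on a path to a leaf."""
    if node.is_leaf():
        return 0
    return 1 + max(depth(child) for child in node.children.values())
```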

proof of Kuhn's theorem (easy)

Proof. We prove by induction on the depth of a subgame G_w that it has a pure strategy SPNE, s^w = (s^w_1, ..., s^w_n). Then s* := s^ε.

Base case, depth 0: In this case we are at a leaf w, and there is nothing to show: each player i gets payoff u_i(w), and the strategies in the SPNE s^w are empty (it doesn't matter which player's node w is, since there are no actions to take).

Inductive step: Suppose the depth of G_w is k + 1. Let Act(w) = {a_1, ..., a_r} be the set of actions available at the root of G_w. The subtrees T_{wa_j}, for j = 1, ..., r, each define a PI-subgame G_{wa_j} of depth at most k. Thus, by induction, each game G_{wa_j} has a pure strategy SPNE, s^{wa_j} = (s^{wa_j}_1, ..., s^{wa_j}_n).

To define s^w = (s^w_1, ..., s^w_n), there are two cases to consider...

two cases

1. pl(w) = 0, i.e., the root node, w, of T_w is a chance node (belongs to "nature"). Let the strategy s^w_i for player i be just the obvious union ∪_{a ∈ Act(w)} s^{wa}_i of its pure strategies in each of the subgames. (Explanation of "union" of disjoint strategy functions.)

Claim: s^w = (s^w_1, ..., s^w_n) is a pure SPNE of G_w. Suppose not. Then some player i could improve its expected payoff by switching to a different pure strategy in one of the subgames. But that violates the inductive hypothesis on that subgame.

2. pl(w) = i > 0. In this case the root node, w, of T_w belongs to player i. For a ∈ Act(w), let h^{wa}_i(s^{wa}) be the expected payoff to player i in the subgame G_{wa}. For some a* ∈ Act(w),

    h^{wa*}_i(s^{wa*}) = max_{a ∈ Act(w)} h^{wa}_i(s^{wa}).

For players i' ≠ i, define s^w_{i'} = ∪_{a ∈ Act(w)} s^{wa}_{i'}. For player i, define s^w_i = (∪_{a ∈ Act(w)} s^{wa}_i) ∪ {w ↦ a*}.

Claim: s^w = (s^w_1, ..., s^w_n) is a pure SPNE of G_w. The proof is basically the same argument as in the previous case: even for the root player, i, to (unilaterally) improve its payoff, it would need to do so in some (strict) subgame.

easy bottom-up algorithm for SPNEs in finite PI-games

The proof yields an EASY bottom-up algorithm for computing a pure SPNE in a finite PI-game: we inductively attach to the root of every subtree T_w a SPNE s^w for the game G_w, together with the expected payoff vector h^w := (h^w_1(s^w), ..., h^w_n(s^w)).

1. Initially: attach to each leaf w the empty profile s^w = (∅, ..., ∅), and the payoff vector h^w := (u_1(w), ..., u_n(w)).

2. While (there is an unattached node w all of whose children are attached):

   if (pl(w) = 0) then
       s^w := (s^w_1, ..., s^w_n), where s^w_i := ∪_{a ∈ Act(w)} s^{wa}_i;
       note, h^w is given by: h^w_i(s^w) := Σ_{a ∈ Act(w)} q_w(a) · h^{wa}_i(s^{wa});
   else if (pl(w) = i > 0) then
       let s^w := (s^w_1, ..., s^w_n) and h^w := h^{wa*}, where
           s^w_{i'} := ∪_{a ∈ Act(w)} s^{wa}_{i'}, for i' ≠ i,
           s^w_i := (∪_{a ∈ Act(w)} s^{wa}_i) ∪ {w ↦ a*},
       and where a* ∈ Act(w) is such that h^{wa*}_i(s^{wa*}) = max_{a ∈ Act(w)} h^{wa}_i(s^{wa}).
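As an illustration, here is a minimal Python sketch of this bottom-up (backward-induction) computation, building on the hypothetical `Node` representation sketched earlier; the function name `spne` and the bundling of all players' strategies into one dictionary are my own choices, not the lecture's:

```python
def spne(node):
    """Return (strategy, payoffs) for the subgame rooted at node.

    strategy maps decision nodes (keyed by id(node), since Node is not hashable)
    to the action chosen there; it bundles all players' pure strategies s^w.
    payoffs is the vector h^w = (h^w_1(s^w), ..., h^w_n(s^w)).
    """
    if node.is_leaf():
        return {}, node.payoffs

    # Recursively solve every child subgame G_{wa}.
    sub = {a: spne(child) for a, child in node.children.items()}

    # "Union" of the children's strategies: their domains are disjoint subtrees.
    strategy = {}
    for child_strategy, _ in sub.values():
        strategy.update(child_strategy)

    n = len(next(iter(sub.values()))[1])  # number of players

    if node.player == 0:
        # Chance node: expected payoffs, weighted by the chance probabilities q_w(a).
        payoffs = tuple(sum(node.probs[a] * sub[a][1][i] for a in node.children)
                        for i in range(n))
    else:
        i = node.player - 1  # payoff index of the player who owns this node
        # Choose an action a* maximizing that player's payoff among the child subgames.
        best = max(node.children, key=lambda a: sub[a][1][i])
        strategy[id(node)] = best
        payoffs = sub[best][1]

    return strategy, payoffs
```

Calling `spne(root)` returns a pure action choice at every decision node together with the root payoff vector, i.e., the profile and the vector h^ε that the loop above attaches to the root.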

consequences for zero-sum finite PI-games

Recall that, by the Minimax Theorem, for every finite zero-sum game Γ there is a value v* such that for any NE (x*_1, x*_2) of Γ, v* = U(x*_1, x*_2), and

    max_{x_1 ∈ X_1} min_{x_2 ∈ X_2} U(x_1, x_2) = v* = min_{x_2 ∈ X_2} max_{x_1 ∈ X_1} U(x_1, x_2).

But it follows from Kuhn's theorem that for extensive PI-games G there is in fact a pure NE (in fact, an SPNE) (s*_1, s*_2) such that v* = u(s*_1, s*_2) := h_1(s*_1, s*_2), and thus that in fact

    max_{s_1 ∈ S_1} min_{s_2 ∈ S_2} u(s_1, s_2) = v* = min_{s_2 ∈ S_2} max_{s_1 ∈ S_1} u(s_1, s_2).

Definition: We say a finite zero-sum game Γ is determined, or has a value, precisely if

    max_{s_1 ∈ S_1} min_{s_2 ∈ S_2} u(s_1, s_2) = min_{s_2 ∈ S_2} max_{s_1 ∈ S_1} u(s_1, s_2).

It thus follows from Kuhn's theorem that:

Proposition ([Zermelo 1912]) Every finite zero-sum PI-game, G, is determined (has a value). Moreover, the value and a pure minimax profile can be computed efficiently from G.

chess

Chess is a finite PI-game (e.g., under the 50-move rule: 50 moves by each player with no capture and no pawn move allow a draw to be claimed). In fact, it's a win-lose-draw PI-game: there are no chance nodes, and the only possible payoffs u(s_1, s_2) are 1, −1, and 0.

Proposition ([Zermelo 1912]) In Chess, either
1. White has a winning strategy, or
2. Black has a winning strategy, or
3. Both players have strategies to force a draw.

A winning strategy, e.g., for White (Player 1), is a pure strategy s_1 that guarantees value u(s_1, s_2) = 1 against every s_2.

Question: Which one is the right answer??

Problem: The tree is far too big!! Even at depth 200 with 5 moves per node: 5^200 nodes! Despite having an efficient algorithm to compute the value v* given the tree, we can't even look at the whole tree! We need algorithms that don't look at the whole tree.

50 years of game-tree search

There's > 50 years of research on chess & other game-playing programs (Shannon, Turing, ...). Heuristic game-tree search procedures are now very refined. See any AI text (e.g., [Russell-Norvig 02]).

We can do a recursive top-down search in Depth-First style. If additionally we have a function Eval(w) that heuristically evaluates a node's "goodness" score, we can use Eval(w) to stop the search at, e.g., any desired depth.

While searching top-down, we can prune out irrelevant subtrees using α-β-pruning. Idea: while searching the minmax tree, maintain two values:
α: the maximizer can assure a score of at least α;
β: the minimizer can assure a score of at most β.

[Figure: a binary minmax game tree of depth 3, Players I and II alternating L/R moves, with leaf payoffs 5, 2, 4, 1, 2, 3, 7, 1.]

minmax search with α-β-pruning

Assume, for simplicity, that players alternate moves and the root belongs to Player 1 (the maximizer). Assume −1 ≤ Eval(w) ≤ +1. Score −1 (+1) means Player 1 definitely loses (wins). Start the search by calling MaxVal(ε, −1, +1).

MaxVal(w, α, β):
    If depth(w) ≥ MaxDepth, then return Eval(w).
    Else, for each a ∈ Act(w):
        α := max{α, MinVal(wa, α, β)};
        if α ≥ β, then return β.
    Return α.

MinVal(w, α, β):
    If depth(w) ≥ MaxDepth, then return Eval(w).
    Else, for each a ∈ Act(w):
        β := min{β, MaxVal(wa, α, β)};
        if β ≤ α, then return α.
    Return β.

Heuristic game-tree search is a world of research unto itself. Please see, e.g., the references in AI texts.
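For concreteness, here is a small runnable Python version of this α-β search on an explicitly given game tree; the nested-tuple encoding of the tree and the rescaled leaf values are illustrative assumptions, with leaf values playing the role of Eval(w) at the cutoff:

```python
def max_val(tree, alpha=-1.0, beta=1.0):
    """Maximizer's turn: tree is either a number in [-1, 1] (a leaf / Eval value)
    or a tuple of child subtrees."""
    if not isinstance(tree, tuple):
        return tree
    for child in tree:
        alpha = max(alpha, min_val(child, alpha, beta))
        if alpha >= beta:          # the minimizer already has a better option elsewhere:
            return beta            # prune the remaining children
    return alpha

def min_val(tree, alpha=-1.0, beta=1.0):
    """Minimizer's turn (symmetric)."""
    if not isinstance(tree, tuple):
        return tree
    for child in tree:
        beta = min(beta, max_val(child, alpha, beta))
        if beta <= alpha:
            return alpha
    return beta

# The depth-3 example tree from the earlier figure, with its leaf payoffs
# 5, 2, 4, 1, 2, 3, 7, 1 rescaled into [-1, 1] purely for illustration.
example = (((0.5, 0.2), (0.4, 0.1)), ((0.2, 0.3), (0.7, 0.1)))
print(max_val(example))  # prints 0.4, i.e., the minmax value (4 before rescaling)
```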

boolean circuits as finite extensive PI-games

Boolean circuits can be viewed as a zero-sum PI-game between AND and OR: OR is the maximizer, AND the minimizer (negations pushed down to the bottom). This is a win-lose PI-game: no chance nodes, and the only possible payoffs u(s_1, s_2) are 1 and −1.

[Figure: the game-tree unfolding of a small circuit with inputs 0 and 1, Players I and II choosing moves L/R, and leaf payoffs ±1.]

Note: the game tree can be exponentially bigger than the circuit, but the efficient bottom-up algorithm works directly on the circuit.
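Here is a minimal sketch (my own illustration, not from the slides) of the bottom-up evaluation working directly on a circuit rather than on its game-tree unfolding; the dictionary encoding of gates and the example circuit are assumptions:

```python
# A tiny circuit given as a DAG: each gate maps to (operator, list of inputs);
# the leaves are input variables with boolean values. The shared gate "g1" is
# reused twice, which is exactly the kind of sharing that can make the
# game-tree unfolding exponentially larger than the circuit itself.
CIRCUIT = {
    "g1": ("or", ["x", "y"]),
    "g2": ("and", ["g1", "z"]),
    "out": ("or", ["g2", "g1"]),
}
INPUTS = {"x": False, "y": True, "z": False}

def value(node, circuit=CIRCUIT, inputs=INPUTS, cache=None):
    """Bottom-up value of a circuit node. True corresponds to payoff +1 for the
    OR player (the maximizer), False to payoff -1 (the AND player wins)."""
    if cache is None:
        cache = {}
    if node in inputs:
        return inputs[node]
    if node not in cache:
        op, kids = circuit[node]
        vals = [value(k, circuit, inputs, cache) for k in kids]
        cache[node] = any(vals) if op == "or" else all(vals)
    return cache[node]

print(value("out"))  # True: in the game view, the OR player has a winning strategy
```

The memo table (`cache`) is what keeps the evaluation linear in the size of the circuit, even when the unfolded game tree would be exponentially larger.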

Ok, let's try to generalize to infinite zero-sum games

For a (possibly infinite) zero-sum 2-player PI-game, we would like to similarly define the game to be determined if

    max_{s_1 ∈ S_1} min_{s_2 ∈ S_2} u(s_1, s_2) = min_{s_2 ∈ S_2} max_{s_1 ∈ S_1} u(s_1, s_2).

But for infinite games the max and min may not exist! Instead, let's define an (infinite) zero-sum game to be determined (to have a value) if:

    sup_{s_1 ∈ S_1} inf_{s_2 ∈ S_2} u(s_1, s_2) = inf_{s_2 ∈ S_2} sup_{s_1 ∈ S_1} u(s_1, s_2).

In the simple setting of infinite win-lose PI-games (2 players, zero-sum, no chance nodes, and the only payoffs are 1 and −1), this definition says a game is determined precisely when one player or the other has a winning strategy: for player 1, a strategy s_1 ∈ S_1 such that for any s_2 ∈ S_2, u(s_1, s_2) = 1 (and vice versa for player 2).

Question: Is every win-lose PI-game determined?
Answer: No...

determinacy and its boundaries

For win-lose PI-games, we can define the payoff function by providing the set Y = u_1^{-1}(1) ⊆ Ψ_T of complete plays on which player 1 wins (player 2 necessarily wins on all other plays). If, additionally, we assume that players alternate moves, we can specify such a game as G = (T, Y).

Fact: Even for the binary tree T = {L, R}*, there are sets Y ⊆ Ψ_T such that the win-lose PI-game G = (T, Y) is not determined.
(This fact is not very easy to show. Its proof uses the axiom of choice. See, e.g., [Mycielski, Ch. 3 of Handbook of GT, 1992].)

Fortunately, large classes of win-lose PI-games are determined. In particular:

Theorem ([D. A. Martin '75]) Whenever Y is a so-called Borel set, the game (Σ*, Y) is determined.
(This is a deep theorem, with deep connections to logic and set theory. The theorem holds even when the action alphabet Σ is an infinite set.)

food for thought: win-lose games on finite graphs

Instead of a tree, we have a finite directed graph:

[Figure: a finite directed graph with a distinguished Start node and a Goal node; some vertices belong to Player I, the others to Player II.]

Starting at Start, does Player I have a strategy to force the play to reach the Goal?

Convince yourself that this is a (possibly infinite) win-lose PI-game. Is this game determined for all finite graphs? If so, how would you compute a winning strategy for Player 1?
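As a hedged hint for the last question (my own sketch, not part of the lecture): one standard approach is a backward "attractor" computation for reachability games, shown below in Python; the graph encoding and the function name are assumptions.

```python
def player1_winning_region(nodes, edges, owner, goal):
    """Nodes from which Player 1 can force the play to reach the goal set.

    nodes: iterable of node names.
    edges: dict mapping each node to its list of successors.
    owner: dict mapping each node to 1 or 2 (who moves there).
    goal:  set of target nodes.

    Iterate to a fixed point: a Player-1 node is winning if SOME successor is
    winning; a Player-2 node is winning if ALL of its successors are winning.
    """
    win = set(goal)
    changed = True
    while changed:
        changed = False
        for v in nodes:
            if v in win:
                continue
            succs = edges.get(v, [])
            if owner[v] == 1 and any(s in win for s in succs):
                win.add(v)
                changed = True
            elif owner[v] == 2 and succs and all(s in win for s in succs):
                win.add(v)
                changed = True
    return win

# Tiny example: Player 1 wins from "start" iff "start" is in the computed region.
nodes = ["start", "a", "b", "goal"]
edges = {"start": ["a", "b"], "a": ["goal"], "b": ["start"]}
owner = {"start": 1, "a": 2, "b": 2, "goal": 1}
print("start" in player1_winning_region(nodes, edges, owner, {"goal"}))  # True
```

A winning strategy for Player 1 can then be read off: at each of its nodes inside the winning region, move to any successor that is also inside the region.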