Game Theory. Enrico Franchi.

May 19, 2010

1 Introduction
   Scope of Agent preferences
   Utility Functions
2 Game Representations
   Example: Game-1
   Extended Form
   Strategic Form
   Equivalences
3 Reductions
   Best Response
   Domination
4 Solution Concepts
   Nash Equilibrium
   Pareto Optimality
   Examples
   Importance of Nash Equilibria

Game theory can be defined as the study of mathematical models of conflict and cooperation between intelligent rational decision-makers. (R. Myerson)

A decision-maker is rational if he makes decisions consistently in pursuit of his own objectives.

A player is intelligent if he knows everything that we know about the game and can make any inference about the situation that we can make.

Utility

The concept of utility is grounded in the concept of preference. Let $O$ denote a finite set of outcomes. For any $o_1, o_2 \in O$:

$o_1 \succeq o_2$ if the agent weakly prefers $o_1$ to $o_2$
$o_1 \sim o_2$ if the agent is indifferent between $o_1$ and $o_2$
$o_1 \succ o_2$ if the agent strongly prefers $o_1$ to $o_2$

A lottery is a probability distribution $[p_1 : o_1, \ldots, p_k : o_k]$ where each $o_i \in O$, each $p_i \in [0, 1]$ and $\sum_{i=1}^{k} p_i = 1$.

We extend $\succeq$ to lotteries.

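Preference relation axioms

Completeness: $\forall o_1, o_2 \in O$, $o_1 \succ o_2$ or $o_2 \succ o_1$ or $o_1 \sim o_2$.

Transitivity: if $o_1 \succeq o_2$ and $o_2 \succeq o_3$, then $o_1 \succeq o_3$.

Substitutability: if $o_1 \sim o_2$, then for all sequences of one or more outcomes $o_3, \ldots, o_k$ and sets of probabilities $p, p_3, \ldots, p_k$ for which $p + \sum_{i=3}^{k} p_i = 1$,
$$[p : o_1, p_3 : o_3, \ldots, p_k : o_k] \sim [p : o_2, p_3 : o_3, \ldots, p_k : o_k]$$

Decomposability: if $\forall o_i \in O$, $P_{\ell_1}(o_i) = P_{\ell_2}(o_i)$, then $\ell_1 \sim \ell_2$, where $P_{\ell}(o_i)$ is the probability that lottery $\ell$ selects outcome $o_i$.

Monotonicity: if $o_1 \succ o_2$ and $p > q$, then $[p : o_1, 1 - p : o_2] \succ [q : o_1, 1 - q : o_2]$.

Continuity: if $o_1 \succ o_2$ and $o_2 \succ o_3$, then $\exists p \in [0, 1]$ such that $o_2 \sim [p : o_1, 1 - p : o_3]$.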

Expected Utility Theorem

Theorem (von Neumann-Morgenstern, 1944). If a preference relation $\succeq$ satisfies the axioms of completeness, transitivity, substitutability, decomposability, monotonicity and continuity, then there exists a function $u : O \to [0, 1]$ satisfying:

$$u(o_1) > u(o_2) \text{ iff } o_1 \succ o_2 \tag{1}$$

$$u([p_1 : o_1, \ldots, p_k : o_k]) = \sum_{i=1}^{k} p_i\, u(o_i) \tag{2}$$

Proof. See R. Myerson or Y. Shoham & K. Leyton-Brown.
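Equation (2) says that the utility of a lottery is the probability-weighted sum of the utilities of its outcomes. A minimal Python sketch of that computation follows. The outcome names and utility values are made up for illustration, and `expected_utility` is a hypothetical helper, not something defined in the slides.

```python
from fractions import Fraction

def expected_utility(lottery, u):
    """Utility of a lottery given as a list of (probability, outcome) pairs."""
    if sum(p for p, _ in lottery) != 1:
        raise ValueError("lottery probabilities must sum to 1")
    return sum(p * u[o] for p, o in lottery)

# Hypothetical utilities on three outcomes, scaled into [0, 1].
u = {"best": Fraction(1), "middle": Fraction(3, 5), "worst": Fraction(0)}

# A fair coin flip between the best and the worst outcome.
coin_flip = [(Fraction(1, 2), "best"), (Fraction(1, 2), "worst")]
value = expected_utility(coin_flip, u)
```

With these assumed numbers the coin flip is worth 1/2, so an agent with this utility function prefers the "middle" outcome for sure (utility 3/5) to the lottery.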

Description of Game-1

At the beginning of the game, player 1 and player 2 each put 1 euro in the pot.
Player 1 draws a card from a shuffled deck in which half the cards are red and half are black.
Player 1 looks at the card privately and decides whether to raise or fold.
If he folds, the game ends. The money goes to player 1 if the card is red or to player 2 if it is black.
If he raises, he puts another euro in the pot and player 2 decides whether to meet or pass.
If player 2 passes, the game ends and the money goes to player 1.
If player 2 meets, he adds another euro to the pot and the game ends. Player 1 takes the money if the card is red and player 2 takes the money if it is black.

An (in?)adequate graph of Game-1

0
  .5 (red): 1
    raise: 2
      meet: (2,-2)
      pass: (1,-1)
    fold: (1,-1)
  .5 (black): 1
    raise: 2
      meet: (-2,2)
      pass: (1,-1)
    fold: (-1,1)

Comments on the graph of Game-1

Each terminal node reports the utilities for both players.
Nodes have a label: 0 for chance nodes, i for nodes played by the i-th player.
Edges leaving a chance node are labelled with the probability that the edge is taken.
Edges leaving a player node are labelled with the action the player chooses.
The graph lacks information. For example, it does not show that player 2 does not know the color of the card.
If the game is complex, the representation is very impractical.

Extended Form of Game-1

0
  .5 (red): 1.a
    raise: 2.0
      meet: (2,-2)
      pass: (1,-1)
    fold: (1,-1)
  .5 (black): 1.b
    raise: 2.0
      meet: (-2,2)
      pass: (1,-1)
    fold: (-1,1)

Description of the Extended Form

Each nonterminal node has a player label in $\{0, 1, \ldots, n\}$. Nodes assigned the player label 0 are called chance nodes. $N = \{1, \ldots, n\}$ is the set of players in the game. Nodes labelled with $i$ are decision nodes for player $i$.
Every alternative at a chance node has a label that specifies its probability. The probabilities of the alternatives at each chance node sum to 1.
Every node that is controlled by a player has a second label that specifies the information state that the player would have if the path of play reached this node. The player knows only what is specified by this label.

Description of the Extended Form

Each alternative at a node that is controlled by a player has a move label. For any two nodes $x$ and $y$ with the same player label and the same information label, and for any alternative at node $x$, there must be exactly one alternative at node $y$ that has the same move label.
Each terminal node has a label that specifies a vector of $n$ numbers $(u_1, \ldots, u_n)$, where $u_i$ is the payoff to player $i$ in that outcome of the game.

On extended form (again...)

If two nodes belong to the same player and have the same information state, the player is unable to distinguish them during the game.
Conversely, if two nodes belong to the same player and he is unable to distinguish them, then they should have the same information state.
A game satisfies perfect recall if, whenever a player moves, he remembers all the information that he knew earlier in the game, including all his past moves. We often assume this condition holds.
A game has perfect information if no two nodes have the same information state. This entails that each player knows exactly where he is in the graph.

Strategies

Definition (Strategy). A strategy for a player in an extensive-form game is any rule for determining a move at every possible information state in the game.

Let $S_i$ be the set of all information states for player $i$.
For $s \in S_i$, $D_s$ is the set of moves available to $i$ when he moves at a node with information state $s$.
A strategy $c_i$ is a function mapping each information state $s \in S_i$ to a move in $D_s$.
With an abuse of notation, the set of strategies for $i$ is $\times_{s \in S_i} D_s$ (why is this acceptable?).
A strategy is a complete rule that specifies a move for the player in all possible cases, even though only one will arise.

Example

In Game-1, the strategies for player 1 are:
{a : raise, b : raise}, {a : fold, b : raise}, {a : raise, b : fold}, {a : fold, b : fold}

The strategies for player 2 are: pass, meet
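The strategy sets above are the Cartesian product of the move sets $D_s$, one factor per information state. The following Python sketch enumerates them. The dictionary encoding and the helper name `pure_strategies` are assumptions made for illustration; the state labels "a", "b" and "0" are taken from the node labels 1.a, 1.b and 2.0.

```python
from itertools import product

def pure_strategies(moves_by_state):
    """All functions from information states to moves, each one as a dict."""
    states = sorted(moves_by_state)
    return [dict(zip(states, choice))
            for choice in product(*(moves_by_state[s] for s in states))]

# Player 1 has two information states (red card: a, black card: b).
player1 = pure_strategies({"a": ["raise", "fold"], "b": ["raise", "fold"]})

# Player 2 has a single information state, so a strategy is just one move.
player2 = pure_strategies({"0": ["meet", "pass"]})
```

Player 1 ends up with 2 x 2 = 4 strategies and player 2 with 2, matching the lists in the example.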

A game

1.1
  T: 2.2
    L: (2,2)
    R: (4,0)
  B: 2.2
    L: (1,0)
    R: (3,1)

Player 2 does not observe player 1's move.
Player 1 would be better off choosing T against both of player 2's strategies.

Another game

Player 2 observes player 1's actual choice before choosing.
Player 1 can influence player 2.

1.1
  T: 2.2
    L1: (2,2)
    R1: (4,0)
  B: 2.3
    L2: (1,0)
    R2: (3,1)

L1 is the best response to T, R2 is the best response to B.
The strategies for player 2 are: {T : L1, B : L2}, {T : R1, B : L2}, {T : L1, B : R2}, {T : R1, B : R2}
If the game is played only once, we could not discover player 2's strategy from his move.

The Battle of Sexes (Example)

Example (The Battle of Sexes). A wife and a husband have to decide what to do in the evening. They both want to go to the cinema; however, they disagree on which movie to see. The husband prefers to see Lethal Weapon (LW), while the wife prefers to see Wondrous Love (WL). Luckily, they both agree that going to the cinema alone is far worse than seeing an uninteresting movie. Unluckily, they are a strange couple and both are very stubborn. Once they have decided where they intend to go, they won't change their minds, no matter what the partner does. They just hope to guess the right choice, since they have no idea what their partner has in mind.

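Battle of Sexes

[Figure: extensive-form tree. Player 1 chooses WL or LW at node 1.a; player 2 then chooses WL or LW at one of two nodes that share the information state 2.b. Terminal payoffs: (3,1), (0,0), (0,0), (1,3).]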

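Battle of Sexes (another point of view...)

[Figure: extensive-form tree. Player 2 chooses WL or LW at node 2.b; player 1 then chooses WL or LW at one of two nodes that share the information state 1.a. Terminal payoffs: (3,1), (0,0), (0,0), (1,3).]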

Games in strategic form

Definition (Strategic Form). A game in strategic form is a tuple $\Gamma$:

$$\Gamma = (N, (C_i)_{i \in N}, (u_i)_{i \in N}) \tag{3}$$

$N$ is a nonempty set of players (usually finite).
For each $i \in N$, $C_i$ is the nonempty set of (pure) strategies available to $i$.
A strategy profile is a combination of strategies, one for each player; $C = \times_{j \in N} C_j$ is the set of strategy profiles.
For each $i \in N$, $u_i : C \to \mathbb{R}$ is a utility function specifying the payoff for player $i$.

Conversion to Strategic Form

Given a game in extended form, automated procedures exist to convert it into strategic form. The von Neumann and Morgenstern procedure gives the normal representation, in strategic form, of an extended-form game:

$N$ is the same set of players as in the extended form.
For any $i \in N$, $C_i$ is the set of strategies available to $i$ in the extended form.
Let $w_i(x)$ be the payoff for player $i$ at terminal node $x$ in the extended form, and let $\Omega$ be the set of terminal nodes. The utility of a strategy profile $c$ for player $i$ is:

$$u_i(c) = \sum_{x \in \Omega} P(x \mid c)\, w_i(x)$$

Probability $P(x \mid c)$ of a node

If $x$ is the root of the game, $P(x \mid c) = 1$.
If $x$ immediately follows a chance node $y$ and $q$ is the chance probability associated with the branch from $y$ to $x$, then $P(x \mid c) = q\,P(y \mid c)$.
If $x$ immediately follows a node $y$ belonging to player $i$ in information state $r$, then $P(x \mid c) = P(y \mid c)$ if $c_i(r)$ is the move label on the alternative from $y$ to $x$, and $P(x \mid c) = 0$ otherwise.
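The three rules for $P(x \mid c)$ translate directly into a recursive walk of the game tree. The sketch below applies them to Game-1 and computes $u_i(c)$ for every strategy profile. The tuple encoding of the tree and the names `GAME_1`, `terminal_probabilities` and `utility` are assumptions made for this illustration; only the payoffs, probabilities and labels come from the slides.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

def leaf(*payoffs):
    return ("leaf", payoffs)

def after_raise(meet_payoffs):
    # Player 2 is in information state "0" whatever the card color.
    return ("player", 2, "0", {"meet": leaf(*meet_payoffs), "pass": leaf(1, -1)})

GAME_1 = ("chance", [
    # Red card: player 1 is in information state "a".
    (HALF, ("player", 1, "a", {"raise": after_raise((2, -2)), "fold": leaf(1, -1)})),
    # Black card: player 1 is in information state "b".
    (HALF, ("player", 1, "b", {"raise": after_raise((-2, 2)), "fold": leaf(-1, 1)})),
])

def terminal_probabilities(node, c, p=Fraction(1)):
    """Yield (P(x|c), payoffs) for each terminal node x with P(x|c) > 0."""
    kind = node[0]
    if kind == "leaf":
        yield p, node[1]
    elif kind == "chance":
        for q, child in node[1]:
            yield from terminal_probabilities(child, c, q * p)
    else:
        _, player, state, moves = node
        # Only the alternative selected by c_i(state) inherits P(y|c);
        # the other alternatives have probability 0 and are skipped.
        yield from terminal_probabilities(moves[c[player][state]], c, p)

def utility(c, n_players=2):
    """u_i(c) as the sum over terminal nodes of P(x|c) * w_i(x)."""
    totals = [Fraction(0)] * n_players
    for p, payoffs in terminal_probabilities(GAME_1, c):
        for i, w in enumerate(payoffs):
            totals[i] += p * w
    return tuple(totals)

table = {}
for a, b, reply in product(["raise", "fold"], ["raise", "fold"], ["meet", "pass"]):
    c = {1: {"a": a, "b": b}, 2: {"0": reply}}
    table[(a, b, reply)] = utility(c)
```

Each key of `table` is (player 1's move at a, player 1's move at b, player 2's move), so the eight entries can be compared with the card game's strategic form.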

Card game in strategic form

Rows: strategies of player 1 ($C_1$). Columns: strategies of player 2 ($C_2$).

                          meet         pass
{a : raise, b : raise}    0, 0         1, -1
{a : raise, b : fold}     0.5, -0.5    0, 0
{a : fold, b : raise}     -0.5, 0.5    1, -1
{a : fold, b : fold}      0, 0         0, 0

Another game in strategic form

The game in which player 2 does not observe player 1's move. Rows: strategies of player 1 ($C_1$). Columns: strategies of player 2 ($C_2$).

       L       R
T      2, 2    4, 0
B      1, 0    3, 1

Another game in strategic form

The game in which player 2 observes player 1's move. Rows: strategies of player 1 ($C_1$). Columns: strategies of player 2 ($C_2$).

       L1 L2    L1 R2    R1 L2    R1 R2
T      2, 2     2, 2     4, 0     4, 0
B      1, 0     3, 1     1, 0     3, 1

L1 R2 means that player 2 answers L1 to T and R2 to B.

Notation

$N - i$ is the set of all players except $i$.
Let $C_{-i} = \times_{j \in N - i} C_j$.
Let $e_{-i} = (e_j)_{j \in N - i} \in C_{-i}$ and $d_i \in C_i$; then $(e_{-i}, d_i) \in C$ is the strategy profile whose $i$-th component is $d_i$ and whose other components are as in $e_{-i}$.
$\Delta(X)$ is the set of probability distributions over $X$.
The set of points in $X$ that maximize a function $f$ is:

$$\arg\max_{y \in X} f(y) = \left\{ y \in X \;\middle|\; f(y) = \max_{z \in X} f(z) \right\}$$

Best Response Equivalence

If player $i$ believes that some distribution $\eta \in \Delta(C_{-i})$ predicts the behaviour of the other players, then $i$ chooses his strategy in $C_i$ to maximize his expected payoff. The set of best responses to $\eta$ is:

$$G_i^{\eta}(C_i) = \arg\max_{d_i \in C_i} \sum_{e_{-i} \in C_{-i}} \eta(e_{-i})\, u_i(e_{-i}, d_i)$$

Two games $\Gamma$ and $\hat{\Gamma}$ are best-response equivalent iff, for every player and every probability distribution over the others' strategies, the sets of best responses are the same.
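As an illustration of the best-response set, the Python sketch below computes it for player 2 in the card game, using player 2's payoffs from the strategic form. The helper name `best_responses`, the dictionary encoding of beliefs and the three example beliefs are assumptions chosen for this sketch.

```python
from fractions import Fraction

def best_responses(own_strategies, belief, payoff):
    """Strategies d that maximize the sum over e of belief[e] * payoff(e, d)."""
    expected = {d: sum(p * payoff(e, d) for e, p in belief.items())
                for d in own_strategies}
    best = max(expected.values())
    return {d for d, value in expected.items() if value == best}

# Player 1's strategies as (move at a, move at b).
RR, RF = ("raise", "raise"), ("raise", "fold")
FR, FF = ("fold", "raise"), ("fold", "fold")

# Player 2's payoffs in the card game, indexed by (player 1 strategy, player 2 move).
U2 = {
    (RR, "meet"): Fraction(0),     (RR, "pass"): Fraction(-1),
    (RF, "meet"): Fraction(-1, 2), (RF, "pass"): Fraction(0),
    (FR, "meet"): Fraction(1, 2),  (FR, "pass"): Fraction(-1),
    (FF, "meet"): Fraction(0),     (FF, "pass"): Fraction(0),
}

def u2(e, d):
    return U2[(e, d)]

moves = ["meet", "pass"]
vs_always_raise = best_responses(moves, {RR: Fraction(1)}, u2)
vs_raise_only_on_red = best_responses(moves, {RF: Fraction(1)}, u2)
vs_mixture = best_responses(moves, {RR: Fraction(1, 3), RF: Fraction(2, 3)}, u2)
```

Against a player 1 who always raises, meeting is the only best response; against one who raises only on red, passing is. Under the belief that mixes the two with weights 1/3 and 2/3, both moves have expected payoff -1/3, so the best-response set contains both.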

Payoff equivalence

Two strategies $e_i$ and $d_i$ are payoff equivalent iff

$$u_j(c_{-i}, d_i) = u_j(c_{-i}, e_i) \quad \forall c_{-i} \in C_{-i},\ \forall j \in N$$

Payoff-equivalent strategies can be merged and replaced with a single strategy.
If all payoff-equivalent strategies are merged in this way, we say the game is in purely reduced normal representation.

Randomized Strategies

Definition. A randomized strategy for player $i$ is any probability distribution over $C_i$. $\Delta(C_i)$ is the set of randomized strategies for player $i$. In contrast, strategies in $C_i$ are called pure strategies.

Let $\sigma_i \in \Delta(C_i)$; $\sigma_i(c_i)$ is the probability that $i$ plays $c_i$ if he is implementing $\sigma_i$.

A strategy $d_i$ is randomly redundant iff there is $\sigma_i \in \Delta(C_i)$ such that $\sigma_i(d_i) = 0$ and

$$u_j(c_{-i}, d_i) = \sum_{e_i \in C_i} \sigma_i(e_i)\, u_j(c_{-i}, e_i) \quad \forall c_{-i} \in C_{-i},\ \forall j \in N$$

The [fully] reduced normal representation is derived from the purely reduced normal representation by eliminating all randomly redundant strategies.

Domination

Definition (Domination). Let $\Gamma = (N, (C_i)_{i \in N}, (u_i)_{i \in N})$ be a game in strategic form. A strategy $d_i \in C_i$ is strongly dominated for player $i$ iff there exists some randomized strategy $\sigma_i \in \Delta(C_i)$ such that:

$$u_i(c_{-i}, d_i) < \sum_{e_i \in C_i} \sigma_i(e_i)\, u_i(c_{-i}, e_i) \quad \forall c_{-i} \in C_{-i} \tag{4}$$

Elimination of Dominated Strategies

It can be proved that if $d_i$ is strongly dominated for player $i$, then $d_i$ is never a best response, no matter what he believes about the other players' strategies.
Removing strongly dominated strategies does not change the analysis of the game.
A strategy that was not strongly dominated may become strongly dominated after some strongly dominated strategy is removed.
The order in which strongly dominated strategies are removed does not matter: it is possible to develop an algorithm to eliminate all strongly dominated strategies.
It is better not to remove simply (weakly) dominated strategies (defined like strongly dominated ones, with $\le$ instead of $<$).

Elimination of Dominated Strategies

Let $\Gamma = (N, (C_i)_{i \in N}, (u_i)_{i \in N})$ be a strategic-form game and let $C_i^{(1)}$ denote the set of all strategies in $C_i$ that are not strongly dominated for $i$.
Consider $\Gamma^{(1)} = (N, (C_i^{(1)})_{i \in N}, (u_i)_{i \in N})$.
By induction, for every positive integer $k$ we can define the strategic-form game $\Gamma^{(k)} = (N, (C_i^{(k)})_{i \in N}, (u_i)_{i \in N})$, where $C_i^{(k)}$ is the set of all strategies in $C_i^{(k-1)}$ that are not strongly dominated for $i$ in $\Gamma^{(k-1)}$.

$$C_i \supseteq C_i^{(1)} \supseteq C_i^{(2)} \supseteq \ldots \quad \forall i \in N$$

Since every set is finite, there is $K$ such that

$$C_i^{(K)} = C_i^{(K+1)} = C_i^{(K+2)} = \ldots \quad \forall i \in N$$
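The inductive construction of $\Gamma^{(k)}$ can be sketched for two-player games as follows. This is a simplified illustration under a stated assumption: it only checks domination by pure strategies, whereas the definition of strong domination also allows domination by a randomized strategy (checking that in general needs a linear program). The function names are hypothetical; the payoffs are those of the game in which player 2 does not observe player 1's move.

```python
def strictly_dominated(candidate, own, others, payoff):
    """True if another pure strategy in `own` does strictly better than
    `candidate` against every strategy in `others` (pure domination only)."""
    return any(all(payoff(e, o) > payoff(candidate, o) for o in others)
               for e in own if e != candidate)

def iterated_elimination(rows, cols, u):
    """u[(r, c)] = (payoff to player 1, payoff to player 2).
    Returns the strategies that survive every round."""
    rows, cols = list(rows), list(cols)
    while True:
        # Both players' sets are reduced with respect to the same game,
        # mirroring the step from one round's game to the next.
        keep_rows = [r for r in rows if not strictly_dominated(
            r, rows, cols, lambda s, o: u[(s, o)][0])]
        keep_cols = [c for c in cols if not strictly_dominated(
            c, cols, rows, lambda s, o: u[(o, s)][1])]
        if keep_rows == rows and keep_cols == cols:
            return rows, cols
        rows, cols = keep_rows, keep_cols

U = {("T", "L"): (2, 2), ("T", "R"): (4, 0),
     ("B", "L"): (1, 0), ("B", "R"): (3, 1)}
residual = iterated_elimination(["T", "B"], ["L", "R"], U)
```

In the first round only B is removed, since neither L nor R dominates the other while B is still available. Once B is gone, R becomes strongly dominated by L, so the surviving profile is (T, L).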

Residual Game

Definition. Given the number $K$, we let $\Gamma^{(\infty)} = \Gamma^{(K)}$ and $C_i^{(\infty)} = C_i^{(K)}$ for every player $i$. The strategies in $C_i^{(\infty)}$ are iteratively undominated in the strong sense. $\Gamma^{(\infty)}$ is called the residual game generated from $\Gamma$ by iterative strong domination.

Since every player is rational, no player would use a dominated strategy, so player $i$ would choose in $C_i^{(1)}$. Moreover, since he is intelligent, he knows that no player $j$ would choose a strategy outside $C_j^{(1)}$. As a consequence, he should choose a strategy in $C_i^{(2)}$. Repeatedly using the assumptions of rationality and intelligence, it is clear that only iteratively undominated strategies are played.

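Notation

Let $\sigma \in \times_{i \in N} \Delta(C_i)$ be a randomized-strategy profile. We define:

$$u_i(\sigma) = \sum_{c \in C} \Big( \prod_{j \in N} \sigma_j(c_j) \Big) u_i(c)$$

$$u_i(\sigma_{-i}, \tau_i) = \sum_{c \in C} \Big( \prod_{j \in N - i} \sigma_j(c_j) \Big) \tau_i(c_i)\, u_i(c)$$

We denote by $[d_i]$ the randomized strategy that gives probability 1 to $d_i$, so that $\sigma_i = \sum_{c_i \in C_i} \sigma_i(c_i)[c_i]$ and

$$u_i(\sigma_{-i}, [d_i]) = \sum_{c_{-i} \in C_{-i}} \Big( \prod_{j \in N - i} \sigma_j(c_j) \Big) u_i(c_{-i}, d_i)$$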

Nash Equilibrium

Definition (Nash Equilibrium). A randomized-strategy profile $\sigma$ is a Nash equilibrium of $\Gamma$ iff it satisfies the condition:

$$\sigma_i(c_i) > 0 \;\Rightarrow\; c_i \in \arg\max_{d_i \in C_i} u_i(\sigma_{-i}, [d_i]) \tag{5}$$

for every player $i$ and every $c_i \in C_i$.

Lemma. A condition equivalent to Equation (5) is that:

$$u_i(\sigma) \ge u_i(\sigma_{-i}, \tau_i) \tag{6}$$

for every player $i$ and for every $\tau_i \in \Delta(C_i)$.
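Condition (5) can be checked mechanically: every pure strategy played with positive probability must be a best response to the other player's randomized strategy. The sketch below does this for two-player games, using the Battle of Sexes payoffs in strategic form. The function name `is_nash` and the dictionary encoding are assumptions; each randomized strategy must list every pure strategy, including those with probability 0.

```python
from fractions import Fraction

def is_nash(sigma1, sigma2, u):
    """Check condition (5) in a two-player game; u[(c1, c2)] = (u1, u2)."""
    def u1_against_sigma2(d):   # payoff to player 1 of pure d against sigma2
        return sum(q * u[(d, c2)][0] for c2, q in sigma2.items())

    def u2_against_sigma1(d):   # payoff to player 2 of pure d against sigma1
        return sum(p * u[(c1, d)][1] for c1, p in sigma1.items())

    for sigma, value in ((sigma1, u1_against_sigma2), (sigma2, u2_against_sigma1)):
        best = max(value(d) for d in sigma)
        # Every strategy in the support must attain the maximum.
        if any(p > 0 and value(c) != best for c, p in sigma.items()):
            return False
    return True

BOS = {("LW", "LW"): (3, 1), ("LW", "WL"): (0, 0),
       ("WL", "LW"): (0, 0), ("WL", "WL"): (1, 3)}

both_lw = is_nash({"LW": 1, "WL": 0}, {"LW": 1, "WL": 0}, BOS)
mismatch = is_nash({"LW": 1, "WL": 0}, {"LW": 0, "WL": 1}, BOS)
mixed = is_nash({"LW": Fraction(3, 4), "WL": Fraction(1, 4)},
                {"LW": Fraction(1, 4), "WL": Fraction(3, 4)}, BOS)
```

Both players choosing LW is an equilibrium, while the mismatched profile is not. The mixed profile is also an equilibrium: each player's two pure strategies yield the same expected payoff, 3/4, against the other's mixture, so no deviation helps.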

Nash Theorem

Theorem (Nash Theorem). Given any finite game $\Gamma$ in strategic form, there exists at least one equilibrium in $\times_{i \in N} \Delta(C_i)$.

Pareto Optimality

A game may have multiple equilibria.
Some equilibria are inefficient.
An outcome is (weakly) Pareto efficient iff there is no other outcome that would make all players better off.

The Prisoner's Dilemma

Rows: strategies of player 1 ($C_1$). Columns: strategies of player 2 ($C_2$).

         g_2     f_2
g_1      5, 5    0, 6
f_1      6, 0    1, 1
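The Prisoner's Dilemma shows how an equilibrium can be inefficient. The Python sketch below lists the weakly Pareto efficient outcomes of the table, using the definition that no other outcome makes all players strictly better off. The function name `weakly_pareto_efficient` and the dictionary encoding are assumptions made for illustration.

```python
def weakly_pareto_efficient(u):
    """Profiles for which no other outcome is strictly better for every player."""
    return {c for c, payoffs in u.items()
            if not any(all(b > a for a, b in zip(payoffs, other))
                       for other in u.values())}

PD = {("g1", "g2"): (5, 5), ("g1", "f2"): (0, 6),
      ("f1", "g2"): (6, 0), ("f1", "f2"): (1, 1)}

efficient = weakly_pareto_efficient(PD)
```

The only outcome that is not weakly Pareto efficient is (f_1, f_2), because (g_1, g_2) gives both players 5 instead of 1. Yet f_i strongly dominates g_i for each player (6 > 5 and 1 > 0), so (f_1, f_2) is what rational players end up playing.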

Battle of Sexes

Rows: strategies of player 1 ($C_1$). Columns: strategies of player 2 ($C_2$).

        LW      WL
LW      3, 1    0, 0
WL      0, 0    1, 3

Importance of Nash Equilibria

We are trying to predict the behaviour of rational, intelligent players.
Their behaviour should tend to an equilibrium, because otherwise they would change their strategies.
Equilibria should be self-fulfilling prophecies.
The focal-point effect, equity and efficiency.
