The Optimality of Regret Matching

Sergiu Hart
Center for the Study of Rationality, Dept of Economics, Dept of Mathematics
The Hebrew University of Jerusalem
July 2008

© 2008 Sergiu Hart

Joint work with Elchanan Ben-Porath

Repeated Games

In a repeated game:
  If the opponent plays i.i.d. (independently and identically distributed), then fictitious play is optimal.
  If the opponent is OBLIVIOUS, then what?

OBLIVIOUS = does not see / know / use my actions
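To make the first claim concrete, here is a minimal Python sketch of fictitious play: at each period, play a best reply to the empirical distribution of the opponent's past actions. The function name `fictitious_play_action`, the payoff-matrix layout `u[i][j]`, the choice of action 0 on an empty history, and the lowest-index tie-breaking are all assumptions made for illustration; the slides do not specify them.

```python
from collections import Counter
from typing import Sequence


def fictitious_play_action(u: Sequence[Sequence[float]], j_hist: Sequence[int]) -> int:
    """Best reply of Player 1 to the empirical frequencies of Player 2's past actions.

    u[i][j] is Player 1's payoff when Player 1 plays i and Player 2 plays j.
    """
    n_actions = len(u)
    if not j_hist:
        return 0  # assumption: arbitrary choice before anything is observed
    counts = Counter(j_hist)
    # Total payoff each action i would have earned against the observed counts.
    totals = [sum(c * u[i][j] for j, c in counts.items()) for i in range(n_actions)]
    # max() returns the first maximizer, so ties go to the lowest index (assumption).
    return max(range(n_actions), key=totals.__getitem__)
```

For example, with `u = [[1, 0], [0, 1]]` and opponent history `[1, 1, 0]`, action 1 has earned more in hindsight and is selected.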

[Illustration: Oblivious]

Model: One-Shot Game

The one-shot game $\Gamma = \langle I, J, u \rangle$
  $I$ = (finite) set of actions of Player 1
  $J$ = (finite) set of actions of Player 2 (the "opponent")
  $u : I \times J \to \mathbb{R}$ = the payoff function of Player 1
  Without loss of generality assume $0 \le u(i, j) \le 1$ for all $i \in I$, $j \in J$

Model: Repeated Game

The repeated game $\Gamma^{O}$
  Player 2 is restricted to OBLIVIOUS strategies $\eta : \bigcup_{t=0}^{\infty} J^t \to \Delta(J)$
  Player 1 is unrestricted: $\sigma : \bigcup_{t=0}^{\infty} (I \times J)^t \to \Delta(I)$

Payoffs
  $v_t := u(i_t, j_t)$ = payoff at time $t$
  $\bar{v}_T := (1/T) \sum_{t=1}^{T} v_t$ = average payoff up to time $T$

Without loss of generality: Player 1 uses "self-oblivious" strategies $\sigma : \bigcup_{t=0}^{\infty} J^t \to \Delta(I)$
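Since an oblivious strategy depends only on Player 2's own past actions, the whole sequence of Player 2's actions can be drawn without looking at Player 1's play. The Python sketch below illustrates this, along with the average payoff up to time T. The names `sample_oblivious_path` and `average_payoff`, and the representation of a mixed action as a dict from actions to probabilities, are assumptions made for this example only.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

History = Tuple[int, ...]                               # Player 2's own past actions
ObliviousStrategy = Callable[[History], Dict[int, float]]  # history -> distribution over J


def sample_oblivious_path(eta: ObliviousStrategy, T: int, rng: random.Random) -> List[int]:
    """Draw j_1, ..., j_T from an oblivious strategy; Player 1's actions are never consulted."""
    path: List[int] = []
    for _ in range(T):
        dist = eta(tuple(path))
        actions = list(dist.keys())
        weights = list(dist.values())
        path.append(rng.choices(actions, weights=weights, k=1)[0])
    return path


def average_payoff(u: Sequence[Sequence[float]], i_seq: Sequence[int], j_seq: Sequence[int]) -> float:
    """Average of v_t = u(i_t, j_t) over the given periods."""
    return sum(u[i][j] for i, j in zip(i_seq, j_seq)) / len(j_seq)
```

As a usage example, a deterministic opponent that alternates between its two actions can be written as `lambda h: {len(h) % 2: 1.0}`.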

Dominance

If Player 1 has a dominant action $i \in I$ in $\Gamma$, then playing $i$ forever is a dominant strategy in the repeated game.

The one-shot game $\Gamma$ is called essential if Player 1 does NOT have a dominant action:
  for every $i \in I$ there exists $i' \in I$ with $u(i, j) < u(i', j)$ for some $j \in J$
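The essentiality condition can be checked mechanically for a finite payoff matrix. The sketch below is a minimal Python illustration, reading "dominant action" as an action that is weakly best against every action of Player 2, which is the negation of the condition stated on the slide. The names `dominant_action` and `is_essential` are assumptions introduced here.

```python
from typing import Optional, Sequence


def dominant_action(u: Sequence[Sequence[float]]) -> Optional[int]:
    """Return an action i with u[i][j] >= u[i2][j] for all i2 and j, or None if there is none."""
    n_rows = len(u)
    n_cols = len(u[0])
    for i in range(n_rows):
        if all(u[i][j] >= u[i2][j] for i2 in range(n_rows) for j in range(n_cols)):
            return i
    return None


def is_essential(u: Sequence[Sequence[float]]) -> bool:
    """A game is essential when Player 1 has no dominant action."""
    return dominant_action(u) is None
```

For instance, matching pennies with payoffs `[[1, 0], [0, 1]]` is essential, while `[[1, 1], [0, 1]]` is not, because the first row is a dominant action.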

Dominance in $\Gamma^{O}$

Proposition. Let $\Gamma$ be an essential game. Then every strategy of Player 1 in $\Gamma^{O}$ is weakly dominated.

That is, for every $\sigma$ there exists $\hat{\sigma}$ such that:
  For every oblivious strategy $\eta$ of Player 2:
    $E_{\hat{\sigma},\eta}[\bar{v}_T] \ge E_{\sigma,\eta}[\bar{v}_T] - 1/T$ for all $T \ge 1$
  There exists an oblivious strategy $\eta_0$ of Player 2 and a constant $\gamma > 0$ such that:
    $E_{\hat{\sigma},\eta_0}[\bar{v}_T] > E_{\sigma,\eta_0}[\bar{v}_T] + \gamma$ for all $T \ge 1$

Regret Matching

For every $k \in K$, let $\sigma_k : \bigcup_{t=0}^{\infty} J^t \to \Delta(I)$ be a (self-oblivious, behavior) strategy of Player 1.

Theorem. For every finite set $K$ there exists a $K$-REGRET-MATCHING strategy $\sigma^* \equiv \sigma^*_K$ such that
  $E_{\sigma^*,\eta}[\bar{v}_T] \ge E_{\sigma_k,\eta}[\bar{v}_T] - 3\sqrt{|K|/T}$
for every $k \in K$, every time $T \ge 1$, and every oblivious strategy $\eta$ of Player 2.

Regret Matching

The $K$-REGRET-MATCHING strategy $\sigma^* \equiv \sigma^*_K$ is defined as follows. At time $T + 1$:
  $U := \sum_{t=1}^{T} v_t$ (realized total payoff)
  For each $k \in K$:
    $V(k) := \sum_{t=1}^{T} u(\sigma_k(h^2_{t-1}), j_t)$ (the total payoff of $\sigma_k$ against the realized sequence of actions of Player 2)
    $R(k) := [V(k) - U]_+$ (the REGRET of $k$)
    $\rho_k := R(k) / \sum_{l \in K} R(l)$ (the normalized regret of $k$)
  $\sigma^*$ plays like $\sigma_k$ with probability $\rho_k$:
    $\sigma^*(h^2_T)(i) = \sum_{k \in K} \rho_k \, \sigma_k(h^2_T)(i)$
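The definition above can be sketched in Python as follows. This is an illustrative reading, not the authors' code: each strategy in K is modeled as a function from Player 2's history to a probability vector over I, the payoff of a mixed action is taken to be its expectation, and the uniform mixture used when all regrets are zero is an assumption, since the slide leaves that case unspecified. The name `regret_matching_mix` is hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

History = Tuple[int, ...]                        # Player 2's past actions j_1..j_t
Strategy = Callable[[History], Sequence[float]]  # history -> distribution over I


def regret_matching_mix(
    u: Sequence[Sequence[float]],
    experts: Sequence[Strategy],
    i_hist: Sequence[int],
    j_hist: Sequence[int],
) -> List[float]:
    """Mixed action of the K-regret-matching strategy at time T + 1."""
    T = len(j_hist)
    n_actions = len(u)
    # U: realized total payoff of Player 1.
    U = sum(u[i][j] for i, j in zip(i_hist, j_hist))
    regrets = []
    for sigma_k in experts:
        # V(k): total (expected) payoff sigma_k would have earned against j_1..j_T.
        V = 0.0
        for t in range(T):
            p = sigma_k(tuple(j_hist[:t]))
            V += sum(p[i] * u[i][j_hist[t]] for i in range(n_actions))
        regrets.append(max(V - U, 0.0))  # R(k) = [V(k) - U]_+
    total = sum(regrets)
    if total > 0:
        rho = [r / total for r in regrets]
    else:
        # Assumption: with no positive regret, mix uniformly over K.
        rho = [1.0 / len(experts)] * len(experts)
    # Play like sigma_k with probability rho_k.
    h = tuple(j_hist)
    mix = [0.0] * n_actions
    for rho_k, sigma_k in zip(rho, experts):
        p = sigma_k(h)
        for i in range(n_actions):
            mix[i] += rho_k * p[i]
    return mix
```

This version recomputes every V(k) from scratch on each call, which keeps the code close to the definition; a practical implementation would likely maintain the sums incrementally.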

Regret Matching

Generalizes Hannan's result: $\sigma_i$ = "play $i$ forever" for each $i \in I$
Blackwell approachability (with changing payoffs)
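In the special case where the comparison strategies are the constant ones ("play i forever"), the regret of action i is the gain from having played i in every past period instead of the realized play. A minimal Python sketch of the resulting mixed action follows; the function name `constant_action_regret_probs` and the uniform fallback when no regret is positive are assumptions made for illustration.

```python
from typing import List, Sequence


def constant_action_regret_probs(
    u: Sequence[Sequence[float]],
    i_hist: Sequence[int],
    j_hist: Sequence[int],
) -> List[float]:
    """Probabilities proportional to the positive regrets for each constant action."""
    n_actions = len(u)
    # Realized total payoff so far.
    U = sum(u[i][j] for i, j in zip(i_hist, j_hist))
    # Regret of action i: payoff of "play i forever" minus realized payoff, if positive.
    regrets = [max(sum(u[i][j] for j in j_hist) - U, 0.0) for i in range(n_actions)]
    total = sum(regrets)
    if total == 0:
        return [1.0 / n_actions] * n_actions  # assumption: uniform when no regret
    return [r / total for r in regrets]
```

For example, with `u = [[1, 0], [0, 1]]`, after playing action 0 against the opponent's action 1, all the regret is on action 1.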

Regret Matching - Countable

Theorem. For every COUNTABLE SET $K$ there exists a $K$-REGRET-MATCHING strategy $\sigma^* \equiv \sigma^*_K$ such that
  $E_{\sigma^*,\eta}[\bar{v}_T] \ge E_{\sigma_k,\eta}[\bar{v}_T] - O(T^{-1/4})$
for every $k \in K$, every time $T \ge 1$, and every oblivious strategy $\eta$ of Player 2.

Regret Matching - Countable

The $K$-REGRET-MATCHING strategy $\sigma^* \equiv \sigma^*_K$ for COUNTABLE $K = \{1, 2, \ldots\}$:
For each $k = 1, 2, \ldots$:
  BLOCK $k$ has $n_k = k^3$ periods
  During BLOCK $k$ play the regret-matching strategy for the set $\{\sigma_1, \ldots, \sigma_k\}$
  Regrets are computed on the basis of the current block only
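The block structure can be made concrete with a small helper that maps a period to its block. This is a sketch under the assumption that periods are numbered from 1 and that block k occupies the k^3 periods immediately after block k - 1; the name `block_of_period` is hypothetical.

```python
from typing import Tuple


def block_of_period(t: int) -> Tuple[int, int]:
    """Return (k, offset): period t is the offset-th period of block k, where block k has k**3 periods."""
    if t < 1:
        raise ValueError("periods are numbered from 1")
    k = 1
    start = 1  # first period of block k
    while t >= start + k ** 3:
        start += k ** 3
        k += 1
    return k, t - start + 1
```

Block 1 is period 1, block 2 covers periods 2 to 9, and block 3 starts at period 10. During block k, the player would run regret matching over the first k strategies, using only the history since the block began. A rough heuristic for the rate, assuming the finite bound $3\sqrt{|K|/T}$ applies within each block: block $k$ contributes total regret of at most about $3\sqrt{k \cdot k^3} = 3k^2$, so after $m$ blocks the total regret is of order $m^3$ while the elapsed time is of order $m^4$, which gives an average regret of order $1/m$, that is, $T^{-1/4}$.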

Regret Matching

Countable sets of strategies:
  Automata
  Turing Machines
  A COMPUTABLE SET OF STRATEGIES
    INPUT: $k$ (index of a strategy), $h_t$ (a history)
    OUTPUT: $\sigma_k(h_t) \in \Delta(I)$
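As an illustration of the INPUT/OUTPUT interface, the Python sketch below defines one possible computable family indexed by k = 1, 2, ...: strategy k plays a best reply to the opponent's last k actions. This family is a hypothetical example chosen for simplicity and is not taken from the slides; the uniform choice on an empty history and the lowest-index tie-breaking are likewise assumptions.

```python
from typing import List, Sequence, Tuple


def sigma(k: int, h: Tuple[int, ...], u: Sequence[Sequence[float]]) -> List[float]:
    """INPUT: index k >= 1 and Player 2's history h. OUTPUT: a distribution over I.

    Hypothetical family: best reply to the last k actions of Player 2.
    """
    n_actions = len(u)
    window = h[-k:]
    if not window:
        return [1.0 / n_actions] * n_actions  # assumption: uniform before any observation
    totals = [sum(u[i][j] for j in window) for i in range(n_actions)]
    best = max(range(n_actions), key=totals.__getitem__)  # ties go to the lowest index
    return [1.0 if i == best else 0.0 for i in range(n_actions)]
```

A single procedure taking both the index and the history is what makes the whole set computable, rather than each strategy separately.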

Regret: The End
