It's All in the Game: An Introduction to Game Theory

Rolf Cook
5 years ago
1 It's All in the Game: An Introduction to Game Theory

2 Introduction to Game Theory
- What is a game?
    - Basic concepts and assumptions
    - Brief early history of game theory
- Dominant and dominated strategies
- Two-player zero-sum games: the minimax theorem
- Two-player non-zero-sum games: the Prisoner's Dilemma

3 What is a Game?
- A situation in which entities (players) with opposing interests must select from among different courses of action (strategies). Each strategy leads to a different outcome (payoff).
- Applies to many fields: economics, biology, sociology, war, politics, philosophy, recreation and entertainment
- The object of game theory is to determine the players' best strategies

4 Example: Matching Pennies
- Two players each display a penny at the same time
- If both pennies match (i.e. both are heads or both are tails), Player 1 wins a penny
- If the pennies do not match, Player 2 wins a penny

5 Game matrix for Matching Pennies
- Player 1 wins all matches; Player 2 wins all mis-matches
                     Player 2
                     Show head             Show tail
Player 1  Show head  (+1, -1) Match        (-1, +1) Mis-match
          Show tail  (-1, +1) Mis-match    (+1, -1) Match
- Convention: Each cell shows (Player 1's payoff, Player 2's payoff)

6 Observations about Matching Pennies
- Both players make their moves at the same time
- The payoffs in each cell add up to zero (zero-sum game)
- No fixed strategy guarantees a win for either player. Fixed strategies (for example, showing heads all the time) are called pure strategies.
- What if a player shows heads or tails randomly? This type of strategy is called a mixed strategy. If each player plays a 50% mixed strategy of heads/tails (for example, by flipping the coin), then each player will win 50% of the time on average.
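The 50% claim for mixed strategies can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original slides; the function names are illustrative, and it assumes the payoff convention of the Matching Pennies matrix (+1 to Player 1 on a match, -1 on a mis-match).

```python
from fractions import Fraction

# Player 1's payoffs from the Matching Pennies matrix
# (rows: Player 1 shows head/tail, columns: Player 2 shows head/tail).
# The game is zero-sum, so Player 2's payoff is the negative of each entry.
PAYOFF_P1 = [[1, -1],
             [-1, 1]]

def expected_payoff_p1(p_head, q_head):
    """Expected payoff to Player 1 when Player 1 shows head with probability
    p_head and Player 2 shows head with probability q_head."""
    p = [p_head, 1 - p_head]
    q = [q_head, 1 - q_head]
    return sum(p[i] * q[j] * PAYOFF_P1[i][j] for i in range(2) for j in range(2))

def win_probability_p1(p_head, q_head):
    """Probability that the pennies match, i.e. that Player 1 wins."""
    return p_head * q_head + (1 - p_head) * (1 - q_head)

half = Fraction(1, 2)
print(expected_payoff_p1(half, half))   # 0
print(win_probability_p1(half, half))   # 1/2
# A pure strategy can be exploited: Player 1 always shows head, Player 2 always shows tail.
print(expected_payoff_p1(Fraction(1), Fraction(0)))  # -1
```

Exact fractions are used instead of floats so that the 50% result comes out exactly.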

7 Another Representation: Extensive Form
Player 1
  Action A -> Player 2
    Action X -> Payoff from (A, X)
    Action Y -> Payoff from (A, Y)
  Action B -> Player 2
    Action X -> Payoff from (B, X)
    Action Y -> Payoff from (B, Y)
- Extensive form is used when players' actions alternate over time so that each player knows the other's prior move(s)

8 Social Contract Theory: The State of Nature
- Thomas Hobbes: Without a central government, the natural state of humanity is a state of war of every individual against every other individual
- In this state, individuals make economic decisions based strictly on what they can compel by their own personal force

9 Whatsoever therefore is consequent to a time of Warre, where every man is Enemy to every man; the same is consequent to the time, wherein men live without other security, than what their own strength, and their own invention shall furnish them withall. In such condition, there is no place for Industry; because the fruit thereof is uncertain: and consequently no Culture of the Earth; no Navigation, nor use of the commodities that may be imported by Sea; no commodious Building; no Instruments of moving, and removing such things as require much force; no Knowledge of the face of the Earth; no account of Time; no Arts; no Letters; no Society; and which is worst of all, continuall feare, and danger of violent death; And the life of man, solitary, poore, nasty, brutish, and short. - Thomas Hobbes, Leviathan, 1651

10 Example: The Farmers' Dilemma (David Hume, 1740)
- Two farmers each have a bumper crop of corn (400 acres)
- Each farmer needs the other's help to completely harvest the corn, or a substantial portion (200 acres) will rot in the field
- The effort expended to harvest the neighboring farmer's field results in a loss of yield of 100 acres of one's own field
- Will the farmer whose corn ripens later (Farmer 2) help the farmer whose corn ripens first (Farmer 1)?
- Starting from the end of the game tree, backward induction eliminates actions that are not payoff-maximizing at each stage

11 Example: The Farmers' Dilemma (David Hume, 1740)
Farmer 2
  Help -> Farmer 1
    Help -> (300,300)
    Don't help -> (100,400)
  Don't help -> Farmer 1
    Help -> (400,100)
    Don't help -> (200,200)

12 Example: The Farmers' Dilemma (David Hume, 1740)
Farmer 2
  Help -> Farmer 1
    Help -> (300,300)
    Don't help -> (100,400)
  Don't help -> Farmer 1
    Help x -> (400,100)
    Don't help -> (200,200)

13 Example: The Farmers' Dilemma (David Hume, 1740)
Farmer 2
  Help -> Farmer 1
    Help x -> (300,300)
    Don't help -> (100,400)
  Don't help -> Farmer 1
    Help x -> (400,100)
    Don't help -> (200,200)

14 Example: The Farmers' Dilemma (David Hume, 1740)
Farmer 2
  Help -> Farmer 1
    Help x -> (300,300)
    Don't help -> (100,400)
  Don't help -> Farmer 1
    Help x -> (400,100)
    Don't help -> (200,200)

15 Observations about the Farmers' Dilemma
- Rational analysis seems to lead to a sub-optimal result: if the farmers could somehow trust each other, and cooperate, they would each get a payoff of 300 vs. 200
- Foreshadows other paradoxes of rationality that are revealed by game theory, notably in the Prisoner's Dilemma
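The backward-induction argument for the Farmers' Dilemma can be sketched in code. This is an illustrative sketch only: the names `TREE` and `backward_induction` are hypothetical, and the payoff pairs are read as (Farmer 2, Farmer 1), an assumption inferred from the acreage arithmetic (a farmer who helps and is not helped in return ends up with 400 - 100 - 200 = 100 acres).

```python
# Game tree: Farmer 2 moves first (help Farmer 1 or not), then Farmer 1 replies.
# Payoffs are acres harvested, read as (Farmer 2, Farmer 1). This ordering is an
# assumption inferred from the acreage figures, not stated on the slides.
TREE = {
    "Help":       {"Help": (300, 300), "Don't help": (100, 400)},
    "Don't help": {"Help": (400, 100), "Don't help": (200, 200)},
}

def backward_induction(tree):
    """Solve the two-stage tree from the end: first Farmer 1's best reply to
    each move by Farmer 2, then Farmer 2's best move given those replies."""
    # Last stage: Farmer 1 maximizes his own payoff (index 1) in each subtree.
    replies = {
        move2: max(subtree, key=lambda move1: subtree[move1][1])
        for move2, subtree in tree.items()
    }
    # First stage: Farmer 2 anticipates the replies and maximizes index 0.
    first = max(tree, key=lambda move2: tree[move2][replies[move2]][0])
    return first, replies[first], tree[first][replies[first]]

print(backward_induction(TREE))
# ("Don't help", "Don't help", (200, 200))
```

Both "Help" branches for Farmer 1 are eliminated at the last stage, which is why neither farmer helps even though (300,300) is available.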

16 Your corn is ripe today; mine will be so tomorrow. 'Tis profitable for us both, that I shou'd labour with you to-day, and that you shou'd aid me to-morrow. I have no kindness for you, and know you have as little for me. I will not, therefore, take any pains on your account; and should I labour with you upon my own account, in expectation of a return, I know I shou'd be disappointed, and that I shou'd in vain depend upon your gratitude. Here then I leave you to labour alone: You treat me in the same manner. The seasons change; and both of us lose our harvests for want of mutual confidence and security. - David Hume

17 Basic assumptions of game theory
- Players are rational: "We wish to find the mathematically complete principles which define rational behavior for the participants in a social economy, and to derive from them the general characteristics of that behavior." (Von Neumann and Morgenstern, Theory of Games and Economic Behavior)
- Players are seeking to maximize utility (max utility = most preferred outcome)
- Players' interests are opposed... Their preferences may still coincide
- Players' strategies comprehensively specify their complete plans of action under all possible circumstances

18 Basic assumptions of game theory - Rationality
- Are people really rational in conflict situations? Experiments suggest they often are not
- Rationality assumes complete ("perfect") information
    - All players know all other players, outcomes, payoffs, etc.
    - All players know all preferences of all other players
    - All the above are common knowledge
- In many cases people do not actually have perfect information. Partial knowledge results in bounded rationality (myopic behavior, poor decisions, etc.)
- The assumption of rationality can lead to paradoxes

19 Example: The Cuban Missile Crisis
- In October 1962 the U.S. discovered that Russia was building a missile base in Cuba
- A Russian fleet was on its way to Cuba to bring in supplies (including missiles with nuclear warheads)
- The U.S. set up a naval blockade
- Basically the U.S. and Russia were playing a game of Chicken: who would back down first?
- Suppose that President Kennedy had been notified that his staff was infiltrated by Russian spies and that Premier Khrushchev would immediately learn of all his decisions. How great an advantage would this have given Khrushchev?

20 The Cuban Missile Crisis: Possible Outcomes and Preferences (4 = most desirable, 1 = least desirable)
US
  4: US stands firm, Russia backs down
  3: US backs down, Russia backs down
  2: US backs down, Russia stands firm
  1: US stands firm, Russia stands firm
Russia
  4: US backs down, Russia stands firm
  3: US backs down, Russia backs down
  2: US stands firm, Russia backs down
  1: US stands firm, Russia stands firm

21 Game Matrix for the Cuban Missile Crisis
                 Russia
                 Back down   Stand firm
US  Back down    (3,3)       (2,4)
    Stand firm   (4,2)       (1,1)

22 The Cuban Missile Crisis: Paradox
- Foreknowledge of Kennedy's decision would have worked to Khrushchev's disadvantage! (Assumes Kennedy was aware of the foreknowledge.)

23 Game Matrix for the Cuban Missile Crisis
                 Russia
                 Back down   Stand firm
US  Back down    (3,3)       (2,4)
    Stand firm   (4,2)       (1,1)
- If Kennedy decides to back down, Khrushchev will know it, and will decide to stand firm (payoff 4 vs. 3: outcome (2,4))

24 Game Matrix for the Cuban Missile Crisis
                 Russia
                 Back down   Stand firm
US  Back down    (3,3)       (2,4)
    Stand firm   (4,2)       (1,1)
- If Kennedy decides to stand firm, Khrushchev will know it, and will decide to back down (payoff 2 vs. 1: outcome (4,2))

25 Game Matrix for the Cuban Missile Crisis
                 Russia
                 Back down   Stand firm
US  Back down    (3,3)       (2,4)
    Stand firm   (4,2)       (1,1)
- Therefore if Kennedy has to choose between only those two alternatives, he will choose to stand firm (payoff 4 vs. 2), causing Khrushchev to back down (which is what happened)
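The foreknowledge paradox is a leader-follower calculation: Khrushchev replies optimally to whatever Kennedy has decided, and Kennedy, aware of this, picks the decision whose reply is best for him. A small sketch under those assumptions (the function names are illustrative, and the payoffs are the ordinal rankings from the matrix):

```python
# Ordinal payoffs (US, Russia) from the game matrix; 4 = most desirable.
MATRIX = {
    ("Back down", "Back down"):   (3, 3),
    ("Back down", "Stand firm"):  (2, 4),
    ("Stand firm", "Back down"):  (4, 2),
    ("Stand firm", "Stand firm"): (1, 1),
}
MOVES = ("Back down", "Stand firm")

def russia_reply(us_move):
    """Russia's best reply once it already knows the US decision."""
    return max(MOVES, key=lambda r: MATRIX[(us_move, r)][1])

def us_choice_with_foreknowledge():
    """US decision when Kennedy knows that Khrushchev will learn it and reply optimally."""
    return max(MOVES, key=lambda u: MATRIX[(u, russia_reply(u))][0])

us = us_choice_with_foreknowledge()
ru = russia_reply(us)
print(us, "/", ru, "->", MATRIX[(us, ru)])
# Stand firm / Back down -> (4, 2)
```

The resulting outcome (4,2) gives Russia a payoff of 2, which is lower than the 3 or 4 it would get in either cell where the US backs down.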

26 Basic assumptions of game theory - Utility
- Utility is supposed to measure players' preferences
    - Not necessarily the same as monetary payoff, but could be correlated with it
    - Takes into account altruism, risk tolerance, etc.
- Utility is closely related to the definition of rationality: rational decision makers should maximize expected utility
- Do people actually try to maximize utility in conflict situations? Experiments suggest they often may not. People may make the first decision that works, rather than the best possible decision

27 Herbert Simon (1916-2001)
- American psychologist and decision science theorist, Carnegie-Mellon University; Nobel Prize, 1978
- One of the key founders of the fields of artificial intelligence and decision theory
- Introduced the concept of bounded rationality
- Put forth the key idea that decision makers "satisfice" instead of actually attempting to maximize utility; that is, they tend to make the first decision that satisfies a set of requirements they are trying to fulfill, instead of the truly optimal decision

28 Utility and Risk Attitudes
- Which would you prefer? A lottery ticket that pays out $10 with probability 50% and $0 otherwise, or a lottery ticket that pays out $3 with probability 100%

29 Utility and Risk Attitudes
- Which would you prefer? A lottery ticket that pays out $10 with probability 50% and $0 otherwise, or a lottery ticket that pays out $3 with probability 100%
- How about: a lottery ticket that pays out $100,000,000 with probability 50% and $0 otherwise, or a lottery ticket that pays out $30,000,000 with probability 100%

30 Utility and Risk Attitudes
- Which would you prefer? A lottery ticket that pays out $10 with probability 50% and $0 otherwise, or a lottery ticket that pays out $3 with probability 100%
- How about: a lottery ticket that pays out $100,000,000 with probability 50% and $0 otherwise, or a lottery ticket that pays out $30,000,000 with probability 100%
- Risk-neutral people only care about the ticket's expected value
- Risk-averse (most) people prefer the ticket's expected value to the ticket
- Risk-seeking people prefer the ticket to the ticket's expected value

31 Maximizing Expected Utility
[Figure: risk-averse utility function with diminishing marginal utility. Horizontal axis: Money; vertical axis: Utility. $200: buy a bike (utility = 1); $1500: buy a nicer bike (utility = 2); $5000: buy a used car (utility = 3).]
- Risk-averse utility functions flatten out the more money is involved because of diminishing marginal utility: each additional dollar provides less utility than the dollar before it

32 Maximizing Expected Utility
[Figure: risk-averse utility function with diminishing marginal utility. Horizontal axis: Money; vertical axis: Utility. $200: buy a bike (utility = 1); $1500: buy a nicer bike (utility = 2); $5000: buy a used car (utility = 3).]
- Choice 1: get $1500 with probability 100%
- Choice 2: Lottery: get a 60% chance of $200 or a 40% chance of $5000
- Which is preferred with a risk-averse utility function?

33 Maximizing Expected Utility
[Figure: risk-averse utility function with diminishing marginal utility, with the lottery drawn as a straight line between the $200 and $5000 points. Horizontal axis: Money; vertical axis: Utility. $200: buy a bike (utility = 1); $1500: buy a nicer bike (utility = 2); $5000: buy a used car (utility = 3). Marked on the lottery line: expected value of Choice 2 = $2120, expected utility of Choice 2 = 1.8.]
- Choice 1: get $1500 with probability 100%. Expected utility = 2
- Choice 2: Lottery: get a 60% chance of $200 or a 40% chance of $5000. Expected utility = 60%(1) + 40%(3) = 1.8 < 2. Expected value = 60%($200) + 40%($5000) = $2120 > $1500
- So: maximizing expected utility is consistent with risk aversion
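The comparison between the two choices is plain arithmetic and can be reproduced as follows. This is a sketch only; the utility values are the three points read off the slide's curve, and the helper names are illustrative.

```python
from fractions import Fraction

# Utility of each money amount, taken from the three labeled points on the slide.
UTILITY = {200: 1, 1500: 2, 5000: 3}

def expected_value(lottery):
    """Probability-weighted average of the money outcomes."""
    return sum(prob * money for money, prob in lottery)

def expected_utility(lottery):
    """Probability-weighted average of the utilities of the outcomes."""
    return sum(prob * UTILITY[money] for money, prob in lottery)

choice_1 = [(1500, Fraction(1))]                             # $1500 for sure
choice_2 = [(200, Fraction(3, 5)), (5000, Fraction(2, 5))]   # 60% / 40% lottery

print(expected_value(choice_1), expected_value(choice_2))                    # 1500 2120
print(float(expected_utility(choice_1)), float(expected_utility(choice_2)))  # 2.0 1.8
```

Choice 2 has the higher expected value but the lower expected utility, so a decision maker with this utility function takes the sure $1500.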

34 Utility Theory Axioms (">" = "is preferred to")
- Completeness: Either A > B, B > A or A = B
- Transitivity: If A > B and B > C, then A > C
- Independence: If A > B, then a weighted average of A and C > a weighted average of B and C
- Continuity: If A > B > C, it should always be possible to find a probability P that makes the individual indifferent between gambling and not gambling in the following situation:
    Gamble: A with probability P%, C with probability (1-P)%
    Don't gamble: B

35 Utility Functions and Rationality
- Only risk-neutrality or risk-aversion is rational. Risk-aversion must also be independent of the size of the risk
- Transitivity implies that only rigorously consistent preferences are rational
- Experiments (Tversky, Kahneman, Thaler) show that some realistic utility functions are not rational
    - Risk-seeking (gambling)...
    - Risk-aversion increasing with ratio of risk to capital/wealth
    - Intransitive: Just because I prefer A to B, and B to C, does not necessarily mean I always prefer A to C
- Preferences can depend on changes in the reference points from which gains and losses will occur

36 Daniel Kahneman (1934-)
- Israeli psychologist and economist, Hebrew University, University of British Columbia, University of California-Berkeley and Princeton University; Nobel Prize, 2002
- Expert in behavioral economics
- With Amos Tversky and their graduate student Richard Thaler, developed prospect theory, which proposes that people do not actually maximize expected utility when making decisions, but rather tend to adjust their risk-aversion to reference points from which gains and losses will occur, causing violations of the transitivity assumption of rationality (inconsistency of preferences)

37 Example: Reference Point Sensitivity (Richard Thaler's experiment)
- Option 1: Everyone just won $30: flip a coin vs. no coin flip. Heads: win $9; tails: lose $9. 70% chose the coin flip
- Option 2: Everyone starts at $0: flip a coin vs. $30 for sure. Heads: win $39; tails: win $21. 43% chose the coin flip
- Participants based their choice on the reference point. There is no actual economic difference between Options 1 and 2

38 Basic assumptions of game theory - Players' interests are opposed
- If players' interests are opposed, can they cooperate? Their preferences can still coincide
- Cooperation can arise naturally from opposing interests
    - Communication, signaling, bargaining, negotiation
    - Coalitions (for example, labor unions vs. management)
    - Evolution over time in repeated games (Iterated Prisoner's Dilemma)

39 Basic assumptions of game theory - Comprehensive strategies
- Is it really possible to formulate comprehensive strategies specifying what actions players will take under all possible circumstances?
    - For some types of games this may not be possible
    - Players may not know all the other players, all possible outcomes, etc. (bounded rationality)
    - It may be impractical to calculate comprehensive strategies even though they may be theoretically possible (for example, chess)

40 Brief early history of game theory
- Pascal's Wager
- Cournot
- Borel
- Von Neumann: Minimax theory
- Dresher and Flood and Tucker: RAND Corporation experiments and the Prisoner's Dilemma

41 Blaise Pascal (1623-1662)
- French mathematician and philosopher
- Believed that nature is actually infinite and that human reason is capable of a nobility and dignity that transcend human finitude
- Maintained that although we are incapable of knowing whether or not God exists, we must make a wager on one or the other alternative in determining how we live our lives

42 God is, or He is not. But to which side shall we incline? Reason can decide nothing here. There is an infinite chaos which separated us. A game is being played at the extremity of this infinite distance where heads or tails will turn up... Which will you choose then? Let us see. Since you must choose, let us see which interests you least. You have two things to lose, the true and the good; and two things to stake, your reason and your will, your knowledge and your happiness; and your nature has two things to shun, error and misery. Your reason is no more shocked in choosing one rather than the other, since you must of necessity choose... But your happiness? Let us weigh the gain and the loss in wagering that God is. - Pascal

43 Game Matrix for Pascal's Wager
                                    Reality
                                    God exists (p)       God does not exist (1-p)
You  Act as if God exists           Eternal reward       Pious life only
     Act as if God does not exist   Eternal punishment   Earthly reward only

44 But there is here an infinity of an infinitely happy life to gain, a chance of gain against a finite number of chances of loss, and what you stake is finite. It is all divided; wherever the infinite is and there is not an infinity of chances of loss against that of gain, there is no time to hesitate, you must give all. - Pascal

45 Game Matrix for Pascal's Wager
                                    Reality
                                    God exists (p)       God does not exist (1-p)
You  Act as if God exists           Eternal reward       x Pious life only
     Act as if God does not exist   Eternal punishment   x Earthly reward only
- The infinity of the payoffs if God exists makes the payoffs if God does not exist irrelevant
- Therefore the strategy of acting as if God exists dominates the strategy of acting as if God does not exist

46 Dominated Strategies
- Strategy A dominates strategy B if a player's payoff with strategy A is always greater than or equal to that of strategy B no matter what the other players do
- It's often possible to simplify a complex game by eliminating dominated strategies

47 Example: Dominated Strategies
                      Player 2
                      Strategy 1   Strategy 2   Strategy 3
Player 1  Strategy A  (7,-7)       (9,-9)       (8,-8)
          Strategy B  (9,-9)       (10,-10)     (12,-12)
          Strategy C  (8,-8)       (8,-8)       (8,-8)

48 Example: Dominated Strategies
                      Player 2
                      Strategy 1   Strategy 2   Strategy 3
Player 1  Strategy A  (7,-7)       (9,-9)       (8,-8)
          Strategy B  (9,-9)       (10,-10)     (12,-12)
          Strategy C  (8,-8)       (8,-8)       (8,-8)
- For Player 1, Strategy B dominates the other two strategies because its payoffs are greater no matter whether Player 2 follows Strategy 1, 2 or 3

49 Example: Dominated Strategies
                      Player 2
                      Strategy 1   Strategy 2   Strategy 3
Player 1  Strategy A  (7,-7)       (9,-9)       (8,-8)
          Strategy B  (9,-9)       (10,-10)     (12,-12)
          Strategy C  (8,-8)       (8,-8)       (8,-8)
- For Player 2, Strategy 1 dominates the other two strategies because its losses are less than or equal to those of Strategies 2 or 3 no matter what strategy Player 1 follows

50 Example: Dominated Strategies
                      Player 2
                      Strategy 1   Strategy 2   Strategy 3
Player 1  Strategy A  (7,-7)       (9,-9)       (8,-8)
          Strategy B  (9,-9)       (10,-10)     (12,-12)
          Strategy C  (8,-8)       (8,-8)       (8,-8)
- Therefore Player 1 will play Strategy B, Player 2 will play Strategy 1 and the resulting payoff will be (9,-9)
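The dominance checks in this example can be done mechanically. The sketch below uses the "greater than or equal" definition from the Dominated Strategies slide; the names `weakly_dominates` and `dominant_strategy` are illustrative and do not come from the slides.

```python
# Payoffs to Player 1 (rows A, B, C; columns: Player 2's Strategy 1, 2, 3).
# The game is zero-sum, so Player 2's payoff is the negative of each entry.
P1 = {"A": [7, 9, 8], "B": [9, 10, 12], "C": [8, 8, 8]}

def weakly_dominates(x, y):
    """True if payoff vector x is at least as good as y against every opposing strategy."""
    return all(a >= b for a, b in zip(x, y))

def dominant_strategy(payoffs):
    """Return a strategy that weakly dominates all the others, or None if there is none."""
    for name, vector in payoffs.items():
        if all(weakly_dominates(vector, other)
               for other_name, other in payoffs.items() if other_name != name):
            return name
    return None

# Player 2's payoff vectors: one per column, negated, indexed by Player 1's rows A, B, C.
P2 = {str(j + 1): [-P1[row][j] for row in "ABC"] for j in range(3)}

best_1 = dominant_strategy(P1)
best_2 = dominant_strategy(P2)
outcome = (P1[best_1][int(best_2) - 1], -P1[best_1][int(best_2) - 1])
print(best_1, best_2, outcome)   # B 1 (9, -9)
```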

51 Antoine Augustin Cournot (1801-1877)
- French mathematician and economist
- Published his Recherches in 1838, founding the theory of the firm
- Based his analysis of the economic behavior of firms on a primitive notion of game theory, specifically, the idea of rivals achieving an equilibrium based on each one making its best response to the other's actions

52 Emile Borel (1871-1956)
- French mathematician and politician
- First to mathematically define the notion of games of strategy
- Gave the first modern formulation of a mixed strategy along with a method of finding the best strategy for certain two-person games

53 John von Neumann (1903-1957)
- Jewish-Hungarian-American prodigy (became Catholic in his first marriage); one of the greatest mathematicians of the 20th century
- Made key contributions to many fields, including set theory, quantum physics, computer science, and the development of the atomic bomb
- Proved the minimax theorem in 1926 in a paper presented to the Göttingen Mathematical Society
- With Oskar Morgenstern, published Theory of Games and Economic Behavior in 1944, effectively founding game theory

54 von Neumann's Minimax Theorem
- In every finite two-person zero-sum game with perfect information, there exists a strategy for each player which allows both players to minimize their maximum losses (hence the name "minimax")
- In a game matrix the minimax payoff and its corresponding strategies are easily recognized if they are pure strategies: the minimax payoff is the smallest in its row and the largest in its column when looked at from the point of view of the row player
- When a pure minimax strategy exists in a game, that strategy will be an equilibrium for that game: once the players settle on a pair of strategies corresponding to a minimax point, they have no reason to change strategies

55 Example of Minimax Equilibrium: Abortion Issue
- The Democrats and Republicans are trying to formulate their strategies about the abortion issue
- Each party will hold a convention, determine its platform privately and announce it simultaneously
- Polling prior to the conventions indicates the percentage of the vote that each party is likely to get with each possible outcome of platform choices
- What strategy should each party adopt? Pro-life? Pro-choice? Or dodge the issue?

56 Game Matrix for Abortion Issue
                         Republicans
                         Pro-life     Pro-choice   Dodge issue
Democrats  Pro-life      (35%,65%)    (10%,90%)    (60%,40%)
           Pro-choice    (45%,55%)    (55%,45%)    (50%,50%)
           Dodge issue   (40%,60%)    (10%,90%)    (65%,35%)

57 Game Matrix for Abortion Issue
                         Republicans
                         Pro-life     Pro-choice   Dodge issue
Democrats  Pro-life      (35%,65%)    (10%,90%)    (60%,40%)
           Pro-choice    (45%,55%)    (55%,45%)    (50%,50%)
           Dodge issue   (40%,60%)    (10%,90%)    (65%,35%)
- 45% is the largest value in the column (looked at from the Democrats' point of view)

58 Game Matrix for Abortion Issue
                         Republicans
                         Pro-life     Pro-choice   Dodge issue
Democrats  Pro-life      (35%,65%)    (10%,90%)    (60%,40%)
           Pro-choice    (45%,55%)    (55%,45%)    (50%,50%)
           Dodge issue   (40%,60%)    (10%,90%)    (65%,35%)
- 45% is the smallest value in the row

59 Game Matrix for Abortion Issue
                         Republicans
                         Pro-life     Pro-choice   Dodge issue
Democrats  Pro-life      (35%,65%)    (10%,90%)    (60%,40%)
           Pro-choice    (45%,55%)    (55%,45%)    (50%,50%)
           Dodge issue   (40%,60%)    (10%,90%)    (65%,35%)
- 45% is called a saddle point and is the minimax
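The saddle-point test (smallest in its row, largest in its column, from the row player's point of view) is easy to automate. A minimal sketch using the Democrats' vote shares from the matrix; the function name `saddle_points` is illustrative.

```python
# Democrats' vote share (%) for each platform pair.
# Rows: Democrats; columns: Republicans; both in the order Pro-life, Pro-choice, Dodge issue.
DEM_SHARE = [
    [35, 10, 60],
    [45, 55, 50],
    [40, 10, 65],
]
LABELS = ["Pro-life", "Pro-choice", "Dodge issue"]

def saddle_points(matrix):
    """Return (row, column, value) for every entry that is the minimum of its
    row and the maximum of its column, from the row player's point of view."""
    points = []
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            column = [matrix[r][j] for r in range(len(matrix))]
            if value == min(row) and value == max(column):
                points.append((i, j, value))
    return points

for i, j, value in saddle_points(DEM_SHARE):
    print(f"Democrats: {LABELS[i]}, Republicans: {LABELS[j]}, Democrats' share: {value}%")
# Democrats: Pro-choice, Republicans: Pro-life, Democrats' share: 45%
```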

60 Why Is the Minimax an Equilibrium?
- The payoff associated with the saddle point is called the value of the game
- By playing the equilibrium strategy (i.e. the strategy that results in the saddle point payoff), each player gets at least the value of the game
- By playing the equilibrium strategy, an opponent can stop a player from getting any more than the value of the game
- Since the game is zero-sum, a player's opponent wants to minimize that player's payoff
- Neither player can gain by changing strategies unilaterally

61 What If There Is No Pure Strategy Minimax?
- By von Neumann's theorem, for a finite two-person zero-sum game with perfect information, a mixed-strategy minimax must exist
    - This means each player selects from among the various pure strategies in the game with some random probability
    - This strategy will give the value of the game on average, rather than with certainty as in a pure strategy minimax
- Finding the mixed-strategy minimax is often difficult
- One way is to choose a mixed strategy that gives the same average payoff (expected value) whatever the opponent does: expected value = sum (payoff x probability of payoff)

62 Example: The Escaped Convict
- A convict escapes from jail
- He has two possible routes of escape: the highway or the forest
- The sheriff can only cover one route
- If they take different routes, escape is certain
- If both take the highway, the convict will certainly be caught
- If both take the forest, the probability of escape is 1 - 1/n, where n represents the difficulty of searching in the forest

63 Game Matrix for The Escaped Convict (Matrix entries indicate the probability of escape or capture)
                   Sheriff
                   Highway   Forest
Convict  Highway   (0,1)     (1,0)
         Forest    (1,0)     (1-1/n,1/n)

64 Game Matrix for The Escaped Convict (Matrix entries indicate the probability of escape or capture)
                   Sheriff
                   Highway   Forest
Convict  Highway   (0,1)     (1,0)
         Forest    (1,0)     (1-1/n,1/n)
- First consider what happens from the Convict's point of view if the Sheriff takes the Highway

65 Game Matrix for The Escaped Convict (Matrix entries indicate the probability of escape or capture)
                        Sheriff
                        Highway   Forest
Convict  Highway (p)    (0,1)     (1,0)
         Forest (1-p)   (1,0)     (1-1/n,1/n)
Highway column: 0(p) + 1(1-p) = 1-p

66 Game Matrix for The Escaped Convict (Matrix entries indicate the probability of escape or capture)
                   Sheriff
                   Highway   Forest
Convict  Highway   (0,1)     (1,0)
         Forest    (1,0)     (1-1/n,1/n)
- Next consider what happens from the Convict's point of view if the Sheriff goes into the Forest

67 Game Matrix for The Escaped Convict (Matrix entries indicate the probability of escape or capture)
                        Sheriff
                        Highway   Forest
Convict  Highway (p)    (0,1)     (1,0)
         Forest (1-p)   (1,0)     (1-1/n,1/n)
Forest column: 1(p) + (1-1/n)(1-p) = p + 1 - p - 1/n + p/n = 1 - 1/n + p/n

68 Analysis of The Escaped Convict
- If the Sheriff takes the Highway: Convict's average payoff is 1 - p
- If the Sheriff takes the Forest: Convict's average payoff is 1 - 1/n + p/n
- To find the probability of taking the Highway that gives the same average payoff for the Convict whatever the Sheriff does, set these two payoffs equal to each other:
    1 - p = 1 - 1/n + p/n
    -p = -1/n + p/n
    -p - p/n = -1/n
    -np - p = -1
    np + p = 1
    p(n + 1) = 1
    p = 1/(n+1) = Convict's probability of taking Highway

69 Game Matrix for The Escaped Convict (Matrix entries indicate the probability of escape or capture)
                                Sheriff
                                Highway   Forest
Convict  Highway, p = 1/(n+1)   (0,1)     (1,0)
         Forest                 (1,0)     (1-1/n,1/n)

70 Game Matrix for The Escaped Convict (Matrix entries indicate the probability of escape or capture)
                                   Sheriff
                                   Highway   Forest
Convict  Highway                   (0,1)     (1,0)
         Forest, 1-p = 1-1/(n+1)   (1,0)     (1-1/n,1/n)

71 Game Matrix for The Escaped Convict (Matrix entries indicate the probability of escape or capture)
                   Sheriff
                   Highway (q)   Forest (1-q)
Convict  Highway   (0,1)         (1,0)          1(q) + 0(1-q) = q
         Forest    (1,0)         (1-1/n,1/n)    0(q) + (1/n)(1-q) = (1-q)/n
- It turns out that the Sheriff's probabilities are the same: q = (1-q)/n; nq = 1-q; (n+1)q = 1; q = 1/(n+1) and 1-q = 1-1/(n+1)

72 Game Matrix for The Escaped Convict (Matrix entries indicate the probability of escape or capture)
                                   Sheriff
                                   Highway   Forest, 1-q = 1-1/(n+1)
Convict  Highway                   (0,1)     (1,0)
         Forest, 1-p = 1-1/(n+1)   (1,0)     (1-1/n,1/n)
- The bigger n is, the more likely the forest route is for both
- The bigger n is, the more likely the Convict is to escape
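The equalizing probabilities p = q = 1/(n+1) can be verified numerically: with those mixes, each player's average payoff is the same whichever route the opponent takes. The following sketch uses exact fractions; the function names are illustrative, and the value n/(n+1) printed below is simply 1 - p from the derivation.

```python
from fractions import Fraction

def escape_matrix(n):
    """Probability of escape. Rows: Convict (Highway, Forest);
    columns: Sheriff (Highway, Forest)."""
    return [[Fraction(0), Fraction(1)],
            [Fraction(1), 1 - Fraction(1, n)]]

def convict_payoff(n, p, sheriff_route):
    """Convict's average escape probability when he takes the Highway with
    probability p and the Sheriff takes sheriff_route (0 = Highway, 1 = Forest)."""
    m = escape_matrix(n)
    return p * m[0][sheriff_route] + (1 - p) * m[1][sheriff_route]

def sheriff_payoff(n, q, convict_route):
    """Sheriff's average capture probability when he takes the Highway with
    probability q and the Convict takes convict_route (0 = Highway, 1 = Forest)."""
    m = escape_matrix(n)
    return q * (1 - m[convict_route][0]) + (1 - q) * (1 - m[convict_route][1])

for n in (2, 4, 10):
    p = q = Fraction(1, n + 1)
    # Same average payoff whatever the opponent does.
    assert convict_payoff(n, p, 0) == convict_payoff(n, p, 1)
    assert sheriff_payoff(n, q, 0) == sheriff_payoff(n, q, 1)
    print(n, p, convict_payoff(n, p, 0))
# 2 1/3 2/3
# 4 1/5 4/5
# 10 1/11 10/11
```

The last column is the Convict's overall escape probability, which grows with n as the final bullet states.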

73 An Executive Decision Maker for The Escaped Convict Game
[Figure: a circle (spinner) divided into two parts. Forest: probability 1-1/(n+1) for both players; Highway: probability 1/(n+1) for both players.]
- The part of the circle representing Forest gets bigger the bigger n gets
- The part of the circle representing Highway gets smaller the bigger n gets

74 Melvin Dresher (1911-1992)
- Polish-American mathematician
- Studied problems of equilibrium in non-zero-sum games
- With Merrill Flood, ran an experiment at the RAND Corporation in 1950 that resulted in the formulation of the game-theoretical model of conflict and cooperation later called the Prisoner's Dilemma by Albert W. Tucker (who was John Nash's Ph.D. thesis advisor at Princeton)

75 Dresher and Flood's RAND Experiment
- Dresher and Flood wondered whether real people actually would tend to find the equilibrium point in a game when the game had a potentially better outcome
- They ran 100 repetitions of the following game between two players (Armen Alchian of UCLA's economics department and John Williams, head of RAND's mathematics department)

76 Dresher and Flood's RAND Experiment
                      Williams
                      Strategy 1           Strategy 2
Alchian  Strategy 1   (-1 cent, 2 cents)   (½ cent, 1 cent)
         Strategy 2   (0, ½ cent)          (1 cent, -1 cent)

77 Dresher and Flood's RAND Experiment
                      Williams
                      Strategy 1           Strategy 2
Alchian  Strategy 1   (-1 cent, 2 cents)   (½ cent, 1 cent)
         Strategy 2   (0, ½ cent)          (1 cent, -1 cent)
- For Alchian, Strategy 2 dominates Strategy 1, because his payoffs in Strategy 2 are higher than his payoffs in Strategy 1 no matter what strategy Williams chooses

78 Dresher and Flood's RAND Experiment
                      Williams
                      Strategy 1           Strategy 2
Alchian  Strategy 1   (-1 cent, 2 cents)   (½ cent, 1 cent)
         Strategy 2   (0, ½ cent)          (1 cent, -1 cent)
- For Williams, Strategy 1 dominates Strategy 2, because his payoffs from Strategy 1 are higher than his payoffs from Strategy 2 no matter what strategy Alchian chooses

79 Dresher and Flood's RAND Experiment
                      Williams
                      Strategy 1           Strategy 2
Alchian  Strategy 1   (-1 cent, 2 cents)   (½ cent, 1 cent)
         Strategy 2   (0, ½ cent)          (1 cent, -1 cent)
- (0, ½ cent) is therefore the equilibrium point
- But (½ cent, 1 cent) is a better outcome for both, even though it is biased in favor of Williams

80 Dresher and Flood's RAND Experiment
- The experiment showed no evidence of any instinctive preference for the equilibrium point, which was chosen only 14 times out of the 100 games
- The players appeared to struggle over the course of the experiment to secure mutual cooperation. This is apparent from reading the logs of the comments made by the players during the experiment

81 Dresher and Flood's RAND Experiment: Excerpts from Log
- Alchian: Williams will play Strategy 1 sure win. Hence if I play 1 I lose | Williams: Hope he's bright.
- Alchian: What is he doing?!! | Williams: He isn't but maybe he'll wise up
- Alchian: Trying mixed? | Williams: Okay, dope
- Alchian: Has he settled on 1? | Williams: Okay, dope
- Alchian: Perverse! | Williams: It isn't the best of all possible worlds
- Alchian: I'm sticking to 2 since he will mix for at least 4 more times. | Williams: Oh ho! Guess I'll have to give him another chance.

82 Dresher and Flood's RAND Experiment: Excerpts from Log
Columns: Game; Alchian strategy; Williams strategy; Alchian comment; Williams comment
- The stinker
- He's crazy. I'll teach him the hard way
- I'm completely confused. Is he trying to convey information to me?
- Let him suffer.
- Maybe he'll be a good boy now
- Always takes time to learn.

83 Albert W. Tucker
- Canadian-born American mathematician, chair of Princeton University mathematics department
- Chaired the AP Calculus committee of the College Board during the 1960s
- Ph.D. thesis advisor of John Nash
- Restructured the game in Dresher and Flood's experiment and gave it the narrative we know today as the Prisoner's Dilemma

84 The Prisoner's Dilemma
- Two criminals collaborate in the commission of a robbery
- They are arrested and held separately on a charge of carrying concealed weapons, which carries a one-year jail term
- The testimony of each one is required in order to convict the other of the robbery, which carries a 20-year jail term
- Each prisoner is offered immunity and a suspended sentence if he will turn state's evidence and testify against the other prisoner (but he could still be convicted on the basis of the testimony of the other prisoner!)
- If both prisoners confess, they each go to jail for 5 years
- If neither prisoner confesses, they each go to jail for 1 year on the concealed weapons charge

85 Game Matrix for the Prisoner's Dilemma

                                                Second Prisoner
                                                Do not confess ("Cooperate")   Confess ("Defect")
First Prisoner   Do not confess ("Cooperate")   (-1 yr., -1 yr.)               (-20 yr., 0)
                 Confess ("Defect")             (0, -20 yr.)                   (-5 yr., -5 yr.)

- Convention: Each cell shows (First Prisoner's payoff, Second Prisoner's payoff)

86 Observations about the Prisoner's Dilemma - This game is not zero-sum (conditions for minimax do not hold) - The minimax strategy for each prisoner is to defect. Look at the game from the first prisoner's point of view: if the second prisoner defects, our prisoner must either cooperate and go to jail for 20 years, or defect and go to jail for 5 years; if the second prisoner cooperates, our prisoner can serve 1 year by cooperating also, or go free by defecting. Either way, defecting is the better reply - If the prisoners could somehow both cooperate by not confessing, they would achieve better payoffs (they both would serve 1 year as opposed to 5 years) - Experiments show that people frequently do defect in situations similar to the Prisoner's Dilemma
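
The reasoning on this slide can be checked mechanically. The following is a minimal sketch, assuming the jail terms are written as negative payoffs keyed by (first prisoner's move, second prisoner's move); the names `PAYOFFS`, `MOVES`, and `strictly_dominates` are illustrative and do not come from the slides.

```python
# Jail terms from the slide, as negative payoffs (more negative = worse).
# Keys: (first prisoner's move, second prisoner's move).
# Values: (first prisoner's payoff, second prisoner's payoff).
PAYOFFS = {
    ("cooperate", "cooperate"): (-1, -1),
    ("cooperate", "defect"): (-20, 0),
    ("defect", "cooperate"): (0, -20),
    ("defect", "defect"): (-5, -5),
}
MOVES = ("cooperate", "defect")


def strictly_dominates(move_a, move_b):
    """True if move_a pays the first prisoner strictly more than move_b
    against every possible move of the second prisoner."""
    return all(
        PAYOFFS[(move_a, other)][0] > PAYOFFS[(move_b, other)][0]
        for other in MOVES
    )


if __name__ == "__main__":
    print("defect dominates cooperate:", strictly_dominates("defect", "cooperate"))
    both_cooperate = PAYOFFS[("cooperate", "cooperate")][0]
    both_defect = PAYOFFS[("defect", "defect")][0]
    print("mutual cooperation beats mutual defection:", both_cooperate > both_defect)
```

Both checks come out true together, which is the dilemma: each prisoner's individually best move leads to an outcome that is worse for both than mutual cooperation.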

87 Canonical Prisoner's Dilemma Matrix

             Cooperate   Defect
Cooperate    (R, R)      (S, T)
Defect       (T, S)      (P, P)

- R = Reward for mutual cooperation; S = Sucker's payoff; T = Temptation to defect; P = Punishment for defection
- Conditions: T > R > P > S and 2R > T + S
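
As a quick check of the two conditions, here is a small sketch. The function name `is_prisoners_dilemma` is hypothetical; the sample values are the jail terms from slide 85 (as negative numbers) and the tournament payoffs from slide 92.

```python
def is_prisoners_dilemma(T, R, P, S):
    """Check the two canonical conditions: T > R > P > S and 2R > T + S."""
    return T > R > P > S and 2 * R > T + S


if __name__ == "__main__":
    # Jail terms from the Prisoner's Dilemma story, in years (negative = worse).
    print(is_prisoners_dilemma(T=0, R=-1, P=-5, S=-20))
    # Payoffs from the tournament matrix on slide 92.
    print(is_prisoners_dilemma(T=5, R=3, P=1, S=0))
    # The ordering holds here, but 2R > T + S fails: taking turns exploiting
    # each other would pay more than steady mutual cooperation.
    print(is_prisoners_dilemma(T=10, R=3, P=1, S=0))
```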

88 Applications of the Prisoner's Dilemma - Nuclear arms race: any one country can defect by breaking an arms treaty - Global climate change: any one country can defect by not adhering to an emissions standard - Steroid use in professional athletics: any one athlete can defect by using performance-enhancing drugs - OPEC (or any other economic cartel): any one country can defect by shipping more oil than its production ceiling allows

89 What if we play the Prisoner's Dilemma repeatedly? (Iterated Prisoner's Dilemma) - If both players know the game will be played exactly N times, then backward induction leads to defecting every time - Why? Defect on the last play, since no retaliation is possible. But then the second-to-last play is effectively the last one, so defect there too, and so on back to the first play - If we don't know how many times the game will be played, it gets a lot more interesting

90 Robert Axelrod (1943-) - Professor of Political Science and Public Policy, University of Michigan - Winner of MacArthur Fellowship - Published seminal work, The Evolution of Cooperation, in 1984, summarizing the results of his study of the Iterated Prisoner's Dilemma

91 Axelrod's Iterated Prisoner's Dilemma Tournament - Different computer programs were run against each other in two pairwise round-robin tournaments. Each game lasted 200 moves (but this was not told to the program developers before the tournament) - Each program embodied a particular strategy. Examples: "Always defect"; "Always cooperate"; "Cooperate until your opponent defects, then always defect"; "Cooperate but defect 10% of the time"; etc. - Participants in the second tournament had access to the results of the first one - There was also a random competitor in each tournament, which defected or cooperated with 50% probability

92 Axelrod's Iterated Prisoner's Dilemma Tournament: Game Matrix

             Cooperate   Defect
Cooperate    (3, 3)      (0, 5)
Defect       (5, 0)      (1, 1)

- The winner was the program accumulating the most total points over the entire tournament

93 Axelrod's Iterated Prisoner's Dilemma Tournament: The results - A program called "Tit for Tat" won both tournaments. Submitted by Anatol Rapoport of the University of Toronto, it started by cooperating, then did whatever its opponent did on the prior move. Strangely like the Golden Rule plus lex talionis - Other programs that did well also started by cooperating. If they started by cooperating, many continued doing so. These programs did well because they did well with each other and because there were enough of them to raise each other's average score - Greedy or selfish programs didn't do as well. They tended to be rapidly punished by counter-defection
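
To make the tournament mechanics concrete, here is a toy round robin. It is a sketch under stated assumptions, not a reproduction of Axelrod's tournament: only four deterministic strategies taken from the examples on slide 91 are entered, each strategy also plays a copy of itself, there is no random competitor, and every game lasts 200 moves. All function names (`play_match`, `round_robin`, `grim`, and so on) are illustrative.

```python
# Payoffs from the tournament matrix: (row player's points, column player's points).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(mine, theirs):
    """Cooperate first, then repeat the opponent's previous move."""
    return "C" if not theirs else theirs[-1]


def always_defect(mine, theirs):
    return "D"


def always_cooperate(mine, theirs):
    return "C"


def grim(mine, theirs):
    """Cooperate until the opponent defects, then always defect."""
    return "D" if "D" in theirs else "C"


def play_match(a, b, moves=200):
    """Play one game and return (points for a, points for b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(moves):
        move_a = a(hist_a, hist_b)
        move_b = b(hist_b, hist_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


def round_robin(strategies, moves=200):
    """Total points per strategy; each pair meets once, plus one self-play game."""
    totals = {s.__name__: 0 for s in strategies}
    for i, a in enumerate(strategies):
        for b in strategies[i:]:
            score_a, score_b = play_match(a, b, moves)
            totals[a.__name__] += score_a
            if b is not a:  # in self-play, count the game once
                totals[b.__name__] += score_b
    return totals


if __name__ == "__main__":
    field = [tit_for_tat, always_defect, always_cooperate, grim]
    for name, total in sorted(round_robin(field).items(), key=lambda kv: -kv[1]):
        print(f"{name:17s} {total}")
```

With so few entrants the ranking depends heavily on who is in the field, so the totals illustrate the scoring rule rather than Axelrod's actual results.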

94 Axelrod's observations about successful Iterated Prisoner's Dilemma strategies - Nice strategies finish first (nice = not being the first to defect) - Good strategies are retaliatory: punish defection immediately, no matter how cooperative the interaction has been so far - Good strategies are forgiving: do not continue punishing defection for more than one move - Good strategies are clear: other strategies easily recognize and adjust to them. None of the more complex programs performed as well as the simple Tit For Tat

95 Implications of the Iterated Prisoner's Dilemma - Biology: Suppose programs reproduce or die out according to their scores. Even strategies that do poorly can affect which strategies do best. Tit for Tat and programs like it end up dominating in population simulations, which suggests that cooperation may evolve in a world of competing entities - Philosophy: Suggests that moral principles underlying cooperation (modeled in a very primitive fashion by Axelrod's observations about successful strategies) could result from evolution - Politics and law: Suggests a basis for social contract theory

97 Normal Form Representation: Game Matrix

                      Player 2
                      Action X              Action Y
Player 1  Action A    Payoff from (A, X)    Payoff from (A, Y)
          Action B    Payoff from (B, X)    Payoff from (B, Y)

- Convention: Each cell shows (Player 1's payoff, Player 2's payoff)

98 The Cuban Missile Crisis: Paradox - Khrushchev's foreknowledge of Kennedy's decision would have worked to Khrushchev's disadvantage, assuming that Kennedy was aware of Khrushchev's foreknowledge! - Why? Reason as follows. If Kennedy decides to back down, Khrushchev will know it and will decide to stand firm: payoff (2,4) (vs. backing down with payoff (3,3)). If Kennedy decides to stand firm, Khrushchev will know it and will decide to back down: payoff (4,2) (vs. standing firm with payoff (1,1)) - Thus if Kennedy backs down the outcome will be (2,4), whereas if he stands firm the outcome will be (4,2); he would therefore choose to stand firm, causing Khrushchev to back down (which is what actually transpired)
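
The argument on this slide is a two-step backward induction, which can be sketched in a few lines. This assumes each payoff pair on the slide is ordered (Kennedy, Khrushchev), which is how the slide's reasoning reads; the names `PAYOFFS`, `khrushchev_reply`, and `kennedy_choice` are illustrative.

```python
# Payoffs from the slide, keyed by (Kennedy's move, Khrushchev's move).
# Assumed ordering of each pair: (Kennedy's payoff, Khrushchev's payoff).
PAYOFFS = {
    ("back down", "back down"): (3, 3),
    ("back down", "stand firm"): (2, 4),
    ("stand firm", "back down"): (4, 2),
    ("stand firm", "stand firm"): (1, 1),
}
OPTIONS = ("back down", "stand firm")


def khrushchev_reply(kennedy_move):
    """Khrushchev knows Kennedy's move and picks his own best reply to it."""
    return max(OPTIONS, key=lambda reply: PAYOFFS[(kennedy_move, reply)][1])


def kennedy_choice():
    """Kennedy anticipates that reply and picks the move with the best outcome for himself."""
    return max(OPTIONS, key=lambda move: PAYOFFS[(move, khrushchev_reply(move))][0])


if __name__ == "__main__":
    move = kennedy_choice()
    reply = khrushchev_reply(move)
    print(move, "/", reply, "->", PAYOFFS[(move, reply)])
```

Because Khrushchev's reply is predictable, Kennedy effectively chooses between the outcomes (2,4) and (4,2), and foreknowledge ends up costing Khrushchev.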

99 A Utility Function - [Figure: utility function U(W) and expected value function E(W) plotted against W, with the wager pW1 + (1-p)W2 marked between W1 and W2] - W = amount of money; p = probability (between 0 and 1); U = utility; E = expected value - The utility of the expected value, U[pW1 + (1-p)W2], is less than the expected value, E[pW1 + (1-p)W2], so this is a risk-averse utility function

100 A Utility Function - [Figure: utility function U(W) and expected value function E(W) plotted against W] - W = amount of money; p = probability (between 0 and 1) = 50%; U = utility; E = expected value; W1 = 100; W2 = 200 - Wager: 50%(100) + 50%(200) - E[pW1 + (1-p)W2] = 150; U[pW1 + (1-p)W2] = 140 - The utility of the expected value is less than the expected value, so this is a risk-averse utility function

101 Another Utility Function - [Figure: utility function U(W) and expected value function E(W) plotted against W, with the wager pW1 + (1-p)W2 marked between W1 and W2] - W = amount of money; p = probability (between 0 and 1); U = utility; E = expected value - The utility of the expected value, U[pW1 + (1-p)W2], is greater than the expected value, E[pW1 + (1-p)W2], so this is a risk-seeking utility function

102 A Utility Function - [Figure: utility function U(W) and expected value function E(W) plotted against W] - W = amount of money; p = probability (between 0 and 1) = 50%; U = utility; E = expected value - Wager: 50%(100) + 50%(200) - U[pW1 + (1-p)W2] = 160; E[pW1 + (1-p)W2] = 150 - The utility of the expected value is greater than the expected value, so this is a risk-seeking utility function

103 Basic assumptions of game theory - Utility: four axioms of utility theory define a rational decision maker - Completeness: For every A and B, either A>B, A=B or A<B: either A is preferred to B, as good as B, or worse than B - Transitivity: For every A, B and C, if A>B and B>C then A>C: if A is preferred to B and B is preferred to C, then A is always preferred to C - Independence: For every set of gambles A, B and C where A>B, there should be some weighting factor 0<w<1 such that wA+(1-w)C > wB+(1-w)C: if A is preferred to B, then the weighted average of A and C should be preferred to the weighted average of B and C (for at least some weight) - Continuity: For every set of gambles A, B and C where A>B>C, there must be a probability p such that B = pA + (1-p)C: if B is ranked between A and C, there must be a possible combination of A and C that makes the individual indifferent between that combination and B (otherwise, it would not be logical for B to be ranked between A and C)

104 Prospect Theory Utility Function

105 Example: Matching Pennies - Two players each display a penny at the same time - If both pennies match (i.e. both are heads or both are tails), Player 1 wins a penny - If the pennies do not match, Player 2 wins a penny

106 Observations about Pascal's Wager - The strategy of acting as if God exists dominates the strategy of acting as if God does not exist. The infinite magnitude of the punishment in the case that God actually exists means that even if the probability (p) of that case is minuscule, the risk is still too great. The infinite magnitude of the reward in the case that God actually exists means that even if the probability (p) of that case is minuscule, the reward is still very great. Therefore the infinity of the payoffs following from God's existence makes the payoffs following from God's nonexistence irrelevant

107 Game matrix for Matching Pennies

                       Player 2
                       Show head    Show tail
Player 1  Show head    (+1, -1)     (-1, +1)
          Show tail    (-1, +1)     (+1, -1)

108 Game matrix for Matching Pennies

                             Player 2
                             Show head (q)   Show tail (1-q)
Player 1  Show head (p)      (+1, -1)        (-1, +1)
          Show tail (1-p)    (-1, +1)        (+1, -1)

109 Game matrix for Matching Pennies

                       Player 2
                       Show head (q)   Show tail (1-q)   Expected value to Player 1
Player 1  Show head    (+1, -1)        (-1, +1)          q - 1 + q = 2q-1
          Show tail    (-1, +1)        (+1, -1)

- Expected value to Player 1 of showing heads = q(+1) + (1-q)(-1) = 2q-1

110 Game matrix for Matching Pennies

                       Player 2
                       Show head (q)   Show tail (1-q)   Expected value to Player 1
Player 1  Show head    (+1, -1)        (-1, +1)
          Show tail    (-1, +1)        (+1, -1)          -q + 1 - q = 1-2q

- Expected value to Player 1 of showing tails = q(-1) + (1-q)(+1) = 1-2q

111 Game matrix for Matching Pennies

                       Player 2
                       Show head (q)   Show tail (1-q)   Expected value to Player 1
Player 1  Show head    (+1, -1)        (-1, +1)          q - 1 + q = 2q-1
          Show tail    (-1, +1)        (+1, -1)          -q + 1 - q = 1-2q

- Player 2 will randomize so that Player 1's two expected values are equal: 2q-1 = 1-2q; 4q = 2; q = ½, 1-q = ½
- Therefore Player 2 should show heads ½ the time and tails ½ the time, randomly

112 Game matrix for Matching Pennies

                                 Player 2
                                 Show head           Show tail
Player 1  Show head (p)          (+1, -1)            (-1, +1)
          Show tail (1-p)        (-1, +1)            (+1, -1)
Expected value to Player 2       -p + 1 - p = 1-2p   p - 1 + p = 2p-1

- Similarly, Player 1 will randomize so that Player 2's two expected values are equal: 1-2p = 2p-1; 4p = 2; p = ½
- Therefore Player 1 should show heads ½ the time and tails ½ the time, randomly
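
The indifference calculation on the last few slides can be reproduced with exact fractions. This is a minimal sketch for 2x2 games only; the helper names `indifference_mix`, `transpose`, and `row_values` are illustrative, and the matrix holds Player 1's payoffs (Player 2's are the negatives, since the game is zero-sum).

```python
from fractions import Fraction

# Player 1's payoffs: rows = Player 1 shows head/tail, columns = Player 2 shows head/tail.
A = [[1, -1],
     [-1, 1]]
# Player 2's payoffs in the same layout (zero-sum game).
B = [[-x for x in row] for row in A]


def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]


def indifference_mix(m):
    """Probability x on the first column that equalizes the two row payoffs of m.

    Solves m00*x + m01*(1-x) = m10*x + m11*(1-x) for x.
    """
    denominator = m[0][0] - m[0][1] - m[1][0] + m[1][1]
    if denominator == 0:
        raise ValueError("no single mixing probability equalizes the rows")
    return Fraction(m[1][1] - m[0][1], denominator)


def row_values(m, x):
    """Expected payoff of each row when the columns are played with (x, 1-x)."""
    return [row[0] * x + row[1] * (1 - x) for row in m]


if __name__ == "__main__":
    q = indifference_mix(A)             # Player 2's mix: 2q-1 = 1-2q
    p = indifference_mix(transpose(B))  # Player 1's mix: 1-2p = 2p-1
    print("q =", q, " p =", p)
    print("Player 1's expected values at q:", row_values(A, q))
```

At q = ½ both of Player 1's expected values are 0, so neither side can gain by leaning toward heads or tails.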

113 Axelrod's Iterated Prisoner's Dilemma Tournament: The results - Example: Tit For Tat vs. Joss - Joss: similar to Tit For Tat, but defects randomly (10% of the time) after the other player cooperates - In this sequence Joss randomly defects on the 6th move. Tit For Tat then defects back on the 7th move even though Joss returns to cooperating. Joss then defects back in response on the 8th move. This results in an echo effect: Joss defects on all the later even-numbered moves and Tit For Tat defects on all the later odd-numbered moves - On the 25th move Joss randomly defects again, causing Tit For Tat to defect back, and another echo begins, causing both programs to defect on every move

114 Axelrod's Iterated Prisoner's Dilemma Tournament: Tit For Tat vs. Joss - [Figure: move-by-move results of the game] Legend: 1 = both cooperated; 2 = Tit For Tat only cooperated; 3 = Joss only cooperated; 4 = both defected - Final score: Tit For Tat 236, Joss 241
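
The echo effect in this game can be replayed deterministically. The sketch below replaces Joss's random 10% defections with two scripted ones, on moves 6 and 25 as described on the previous slide, and assumes a 200-move game with the payoffs from slide 92. The names `scripted_joss` and `FORCED_DEFECTIONS` are illustrative stand-ins, not the original program.

```python
# Payoffs keyed by (Tit For Tat's move, Joss's move): (Tit For Tat points, Joss points).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

# Assumption: the two unprovoked defections happen on exactly these moves (1-based).
FORCED_DEFECTIONS = {6, 25}


def tit_for_tat(opponent_history):
    """Cooperate first, then repeat the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def scripted_joss(opponent_history, move_number):
    """Tit For Tat, except for scripted defections standing in for Joss's random ones."""
    if move_number in FORCED_DEFECTIONS:
        return "D"
    return tit_for_tat(opponent_history)


def play(moves=200):
    """Return (Tit For Tat score, Joss score, Tit For Tat moves, Joss moves)."""
    tft_moves, joss_moves = [], []
    tft_score = joss_score = 0
    for n in range(1, moves + 1):
        tft_move = tit_for_tat(joss_moves)
        joss_move = scripted_joss(tft_moves, n)
        tft_points, joss_points = PAYOFF[(tft_move, joss_move)]
        tft_score += tft_points
        joss_score += joss_points
        tft_moves.append(tft_move)
        joss_moves.append(joss_move)
    return tft_score, joss_score, tft_moves, joss_moves


if __name__ == "__main__":
    tft_score, joss_score, tft_moves, joss_moves = play()
    print("Moves 5-10, Tit For Tat:", "".join(tft_moves[4:10]))
    print("Moves 5-10, Joss:       ", "".join(joss_moves[4:10]))
    print("Final score:", tft_score, joss_score)
```

Under these assumptions the replay gives the same totals as the slide, 236 for Tit For Tat and 241 for Joss, far below the 600 each that uninterrupted cooperation would have earned.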

115 Tit For Tat Always Cooperate Spiteful Bully (Defect until opponent defects back, then cooperate unless opponent defects three times in a row)


117 Collectively stable strategies in the Iterated Prisoner's Dilemma - A strategy is collectively stable if, in a population of entities using it, no other strategy can successfully invade and establish itself. To invade, the new strategy would have to get a higher score against the native strategy than the native strategy gets against another copy of the native strategy
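
The invasion test on this slide is a single comparison of two scores. The sketch below applies it to a few simple strategies, assuming a fixed 200-move game and the payoffs from slide 92; a fixed game length is a simplification chosen here for illustration, and the names `score` and `can_invade` are hypothetical.

```python
# Payoffs keyed by (my move, opponent's move): points for "me".
POINTS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(opponent_history):
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    return "D"


def always_cooperate(opponent_history):
    return "C"


def score(a, b, moves=200):
    """Total points strategy a earns in one game against strategy b."""
    hist_a, hist_b = [], []
    total = 0
    for _ in range(moves):
        move_a, move_b = a(hist_b), b(hist_a)
        total += POINTS[(move_a, move_b)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total


def can_invade(newcomer, native, moves=200):
    """True if the newcomer scores more against the native than the native does against itself."""
    return score(newcomer, native, moves) > score(native, native, moves)


if __name__ == "__main__":
    print("Always Defect invades Tit For Tat:     ", can_invade(always_defect, tit_for_tat))
    print("Tit For Tat invades Always Defect:     ", can_invade(tit_for_tat, always_defect))
    print("Always Defect invades Always Cooperate:", can_invade(always_defect, always_cooperate))
```

In this simplified setting a lone defector cannot invade a Tit For Tat population, a lone Tit For Tat cannot invade a population of defectors either, and unconditional cooperators are easily invaded.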
