Heckmeck am Bratwurmeck or How to grill the maximum number of worms


Roland C. Seydel, 24/05/22

Overview

1. Introducing the dice game
   - The basic rules
   - Understanding the strategy
2. Computing the optimal strategy
   - Value function and state space
   - Computing the value function
3. The results


The rules (I)

Goal of the game: Get the most worms by throwing high (spot) numbers.

1. There are eight six-sided dice showing the numbers 1 to 5 and, instead of a 6, a worm.
2. You can throw the dice as often as you want; after each throw you choose a number which you haven't chosen yet and put aside all the dice showing this number.
3. The numbers of the dice put aside are added, where a worm (6) counts as a 5. No 6 put aside → 0 points!

You can take a piece from the shelf with a number ≤ your points, and put it on top of your pile.

The rules (II)

The winner is the player with the most worms in his pile.

Several additional rules which matter to us:
- Not able to put aside a new number? → 0 points!
- You can also take a piece from the top of your coplayers' piles, if you match the number exactly.
- If your points are not sufficient to pick a piece, you lose the piece on top of your own pile.

Rules we ignore for now:
- Lost pieces are put back on the shelf; the piece with the largest available number is removed from the shelf.
- Your coplayers can also take pieces from you when it's their turn!
- The game is over when the shelf is empty.

Example game

[Figure: example game]

Optimal strategy: Easy to tell

- Is selecting four 1s a good idea?
- Are two 5s better, or two 6s?
- Are two 4s better, or two 5s?
- If I could lose four worms, should I bet or rather risk nothing?

Optimal strategy: Difficult to tell

- Are three 5s better, or two 6s? And what about four 5s?
- Could it be optimal to select one or more 3s in the first throw?
- Should I take the five 5s, or is it too dangerous?
- What is the expected / most likely outcome in terms of worms?
- Should I stop with three 5s and two 6s, or continue?

Parallels to option pricing

- Early exercise: The player can decide when to stop and exercise. → American option
- Optimal control: At each point in time, a decision has to be taken. → Swing option
- Knock out: If there is no 6, or you are not able to put aside a new number, your points are 0. → Barrier option

Conclusion: Compute the optimal solution by option pricing methods?


Differences to option pricing

Option pricing                     | Heckmeck
-----------------------------------|--------------------------------------
Continuous state space             | Discrete state space
Continuous time                    | Only up to 8 times
Typically 1d state space           | Up to 8d state space
Same state space in time           | Changing state space
Knockout happens at barrier line   | No clear line for knockout
Random noise is added to process   | Random is not added to state
No decision needed                 | Each throw needs a decision (control)

Assumptions

1. The player wants to maximize the expected number of worms in his turn.
2. Reactions of other players are not anticipated (single-player optimization).
3. We optimize one single turn of up to 6 throws, not the total outcome of the game!

Value function

Situation: An array of pieces available on the shelf, on other players' piles and on your own pile is called a situation.

State space: A vector $x \in \{1,\dots,6\}^t$ of already selected dice at time $t \in \{0,\dots,8\}$ is called the state at time $t$. The state space for time $t$ is the ensemble of all possible selections $x \in \{1,\dots,6\}^t$. We consider only sorted states, with notation $x = \{\dots\}$.

Value function: For a particular situation, the value function $v(x)$ is the expected number of worms (on pieces), assuming that starting from state $x$ the expectation-optimal decisions will be taken.

Caveat: From now on, worms are only worms on pieces, not on dice!
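To get a feel for the size of this state space, the sorted states can be enumerated directly. The following is a minimal Python sketch, not the author's code; the helper name `states` is hypothetical, and a sorted state is represented as a sorted tuple of faces.

```python
from itertools import combinations_with_replacement
from math import comb

def states(t):
    """All sorted states x in {1,...,6}^t, as sorted tuples of faces."""
    return list(combinations_with_replacement(range(1, 7), t))

# Number of sorted states for each time t = 0, ..., 8
sizes = [len(states(t)) for t in range(9)]
print(sizes)        # [1, 6, 21, 56, 126, 252, 462, 792, 1287]
print(sum(sizes))   # 3003 sorted states in total
```

Each size equals the multiset count C(t+5, 5), so the whole problem has only a few thousand states, which is why an exhaustive computation is feasible.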

Value function: payoff

(Worm) payoff: For a particular situation let $w(n)$ be the number of worms you would get or lose for a sum of points $n$. Then the (worm) payoff $p(x)$ of a state $x$ is defined by the number of worms you would get or lose upon termination. In formulas,

$$p(x) = w(n(x)) = w\left(\begin{cases} \sum_i \min(x_i, 5) & \text{if } 6 \in x \\ 0 & \text{else} \end{cases}\right).$$

Each situation has a worst payoff $w(0)$, equal to the negative number of worms on top of your pile.

Examples ({...} denotes a sorted vector):
- $p(\{5, 5, 5, 5, 5, 5, 5, 5\}) = w(0)$
- $p(\{1, 2, 3, 4, 5, 6, 6, 6\}) = 3$
- $p(\{1, 5, 5, 5, 6\}) = 1$
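The payoff can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's implementation: the names `worms_on_tile`, `w` and `payoff` are hypothetical, the tiles 21 to 36 carry 1 to 4 worms in blocks of four, the player takes the largest shelf tile not exceeding the points, and taking pieces from other players' piles is ignored.

```python
WORM = 6  # face 6 shows a worm and counts as 5 points

def worms_on_tile(n):
    """Worms on tile n: 21-24 -> 1, 25-28 -> 2, 29-32 -> 3, 33-36 -> 4."""
    return (n - 21) // 4 + 1

def w(n, shelf=range(21, 37), own_top_worms=0):
    """Worms gained (or lost) for a sum of points n in a given situation."""
    takeable = [tile for tile in shelf if tile <= n]
    if not takeable:
        return -own_top_worms          # worst payoff w(0)
    return worms_on_tile(max(takeable))

def payoff(x, **situation):
    """Payoff p(x) of a sorted state x; without a worm the points are 0."""
    n = sum(min(d, 5) for d in x) if WORM in x else 0
    return w(n, **situation)

print(payoff((5, 5, 5, 5, 5, 5, 5, 5)))   # 0, i.e. w(0) with an empty pile
print(payoff((1, 2, 3, 4, 5, 6, 6, 6)))   # 3
print(payoff((1, 5, 5, 5, 6)))            # 1
```

With the default full shelf and an empty own pile, the three calls reproduce the examples above.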

Value function on intermediate states

Intermediate state: For a state $x \in \{1,\dots,6\}^t$ and a throw $y \in \{1,\dots,6\}^{8-t}$, the tuple $(x, y)$ is called an intermediate state, i.e., a state that still needs a decision.

We can also define the value function of intermediate states $(x, y)$:
- Either there is a valid best choice $\tilde y \subseteq y$ (in particular $\tilde y \cap x = \emptyset$) such that $v((x, y)) = v(\{x, \tilde y\})$,
- or there is no valid choice, in which case $v((x, y)) = w(0)$ (worst payoff).

Conclusion: It is sufficient to compute the value function on normal states only!

Value function: optimal exercise

For states in $\{1,\dots,6\}^8$, the value function is equal to the payoff (assuming full shelf), e.g.:
- $v(\{5, 5, 5, 5, 5, 5, 5, 5\}) = p(\{5, 5, 5, 5, 5, 5, 5, 5\}) = w(0)$
- $v(\{1, 2, 3, 4, 5, 6, 6, 6\}) = p(\{1, 2, 3, 4, 5, 6, 6, 6\}) = 3$

We call this the terminal condition.

There are other types of states for which the value function is determined by the payoff function:

Optimal exercise: It is optimal to exercise in a state $x \in \{1,\dots,6\}^t$ if $v(x) = p(x)$, i.e., the value function equals the payoff function. In this case the expected optimal number of worms can be obtained immediately.

But we do not know the value function yet...

State space in time t: example

Each step consists of a dice throw (random) followed by a decision.

t=3: x = {5, 5, 5}. Example throws of the remaining five dice:
- y = {1, 3, 3, 4, 5} → decide: x = {4, 5, 5, 5} (t=4), x = {1, 5, 5, 5} (t=4) or x = {3, 3, 5, 5, 5} (t=5)
- y = {5, 5, 5, 6, 6} → x = {5, 5, 5, 6, 6} (t=5). Exercise?
- y = {5, 5, 5, 5, 5} → Knock out!
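The transition from an intermediate state (x, y) to the possible next states can be sketched as follows. This is a hypothetical Python helper (`valid_choices` is not a name from the slides), with states and throws represented as sorted tuples; an empty result means knock out.

```python
def valid_choices(x, y):
    """Map each selectable face of throw y to the resulting sorted state.

    A face is selectable only if it has not been put aside before;
    all dice of the throw showing that face are put aside together.
    """
    new_faces = sorted(set(y) - set(x))
    return {f: tuple(sorted(x + (f,) * y.count(f))) for f in new_faces}

x = (5, 5, 5)
print(valid_choices(x, (1, 3, 3, 4, 5)))
# {1: (1, 5, 5, 5), 3: (3, 3, 5, 5, 5), 4: (4, 5, 5, 5)}
print(valid_choices(x, (5, 5, 5, 6, 6)))   # {6: (5, 5, 5, 6, 6)}
print(valid_choices(x, (5, 5, 5, 5, 5)))   # {} -> knock out
```

The three calls correspond to the three throws of the example above.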

Finding the value function: Possible approaches

Monte Carlo simulation? This is the approach implicitly chosen by experienced players:
- Simulate dice throws on the computer
- Try different strategies in different (simulated) games
Problem: Causality is difficult to establish because of the multitude of possible strategies → many simulations needed!

Use the recursion (dynamic programming) principle? The solution that is fastest to implement:
- Uses

  $$v(x) = \begin{cases} \max\left\{ p(x),\ \mathrm{E}_Y\left[ \max_{\tilde y \subseteq Y} v(\{x, \tilde y\}) \right] \right\} & |x| < 8 \\ p(x) & |x| = 8 \end{cases} \qquad (1)$$

  where $Y$ follows a multinomial (dice) distribution, and the inner maximum is $w(0)$ if no valid choice $\tilde y$ exists
- One command v({}) starts the whole calculation recursively
Problem: Takes an eternity, because most states are computed multiple times

Backward induction! Start at terminal time and compute $p(x)$, then go backwards in time using (1). Each state is computed only once.

Needed: The multinomial distribution

Q: If you toss a coin 5 times, what is the probability of getting $5 - 4 = 1$ head and 4 tails?
A: Binomial distribution!

$$\frac{5!}{(5-4)!\,4!}\; 0.5^{5-4}\,(1-0.5)^{4}$$

Multinomial (dice) distribution: If you throw a k-sided die n times, then the probability of getting $x_i$ times the spot number $i$ for $i = 1,\dots,k$ is

$$f_M(x) = \frac{n!}{x_1! \cdots x_k!}\; p_1^{x_1} \cdots p_k^{x_k},$$

where $\sum_{i=1}^{k} p_i = 1$ and $\sum_{i=1}^{k} x_i = n$.
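As a quick numerical check of the formula, here is a small Python sketch (the function name `multinomial_pmf` is an assumption for illustration). It evaluates the coin example as the special case k = 2 and one fair-die outcome.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """f_M(x) = n! / (x_1! ... x_k!) * p_1^x_1 ... p_k^x_k with n = sum(x_i)."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)          # stays an exact integer at every step
    return coef * prod(p ** c for p, c in zip(probs, counts))

# Coin example: 1 head and 4 tails in 5 tosses
print(multinomial_pmf([1, 4], [0.5, 0.5]))            # 0.15625

# Fair die: probability that 3 dice show one 1, one 3 and one 4
print(multinomial_pmf([1, 0, 1, 1, 0, 0], [1 / 6] * 6))
```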

Backward induction

The backward induction makes the implicit equation (1) explicit by making sure that the values on the right-hand side are already computed:

Algorithm: Backward induction
1. At time $t = 8$, determine the terminal payoff $p(x)$ for all possible states $x \in \{1,\dots,6\}^8$.
2. Go backwards in time, $t \to t - 1$; for each $t$-state $x$ do:
   1. Compute the distribution of possible dice scenarios.
   2. Check for each scenario whether it is knocked out (no new spot numbers), or else to which future states the scenario could lead.
   3. Take the expectation $\mathrm{E}_Y\left[ \max_{\tilde y \subseteq Y} v(\{x, \tilde y\}) \right]$ with $v$ from future times.
   4. Take the maximum with $p(x)$ (exercise instead of throwing dice).

MATLAB implementation runs in just a few seconds on a normal PC!
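The algorithm can be sketched compactly in Python. This is not the author's MATLAB code but an illustration under assumptions: all function names are hypothetical, states are sorted tuples, the player takes the largest shelf tile not exceeding the points, tiles 21 to 36 carry 1 to 4 worms in blocks of four, and taking pieces from other players is ignored.

```python
from itertools import combinations_with_replacement
from math import factorial

N_DICE, WORM = 8, 6

def make_w(shelf=range(21, 37), own_top_worms=0):
    """Return w(n) for one situation (simplified: no stealing)."""
    tiles = sorted(shelf)
    def w(n):
        takeable = [tile for tile in tiles if tile <= n]
        if not takeable:
            return -own_top_worms              # worst payoff w(0)
        return (max(takeable) - 21) // 4 + 1   # worms on the tile taken
    return w

def payoff(x, w):
    """p(x): worms for the dice sum, or w(0) if no worm was put aside."""
    return w(sum(min(d, 5) for d in x)) if WORM in x else w(0)

def throw_distribution(m):
    """Sorted throws of m dice with their multinomial probabilities."""
    dist = []
    for y in combinations_with_replacement(range(1, 7), m):
        perms = factorial(m)
        for face in set(y):
            perms //= factorial(y.count(face))
        dist.append((y, perms / 6 ** m))
    return dist

def backward_induction(w):
    """Value function v on all sorted states, computed from t = 8 down to 0."""
    dists = {m: throw_distribution(m) for m in range(1, N_DICE + 1)}
    v = {}
    for t in range(N_DICE, -1, -1):
        for x in combinations_with_replacement(range(1, 7), t):
            stop = payoff(x, w)
            if t == N_DICE:
                v[x] = stop                    # terminal condition
                continue
            cont = 0.0
            for y, prob in dists[N_DICE - t]:
                new_faces = set(y) - set(x)
                if new_faces:                  # best valid selection
                    best = max(v[tuple(sorted(x + (f,) * y.count(f)))]
                               for f in new_faces)
                else:                          # knocked out
                    best = w(0)
                cont += prob * best
            v[x] = max(stop, cont)             # exercise or throw again
    return v

w = make_w()                   # full shelf 21..36, empty own pile
v = backward_induction(w)
print(round(v[(5, 5, 5, 6, 6)], 4))
```

Under these assumptions the sketch gives v({5, 5, 5, 6, 6}) ≈ 2.6277 for the full shelf, in line with the value quoted on the results slide for this state. Other situations can be tried by passing a different `shelf` or `own_top_worms` to `make_w`.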


Contour lines of value function, full shelf, initial throw

Figure: Contour lines of the value function in the initial throw, if the shelf starts at 21, as a function of the spot number (horizontal axis) and the number of selected dice of that spot (vertical axis).

Distribution of points

Figure: Distribution of dice sum under optimal strategy, from a Monte Carlo simulation with 1000 paths (horizontal axis: dice sum; vertical axis: probability). Left: full shelf; right: shelf starting at 30.

Explanation: The left panel shows almost no values in [5, 20], because for these dice sums knockout is very likely.

Most likely selections

What is the most likely initial selection of dice, assuming we act optimally? Answer: two 6

Spots | Number of dice
      | 1       2       3       4       5       6       7       8
------|---------------------------------------------------------------
1     | 0.0003  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0
2     | 0.09    0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0
3     | 0.13    0.0507  0.0009  0.0003  0.00    0.0000  0.0000  0.0
4     | 0.0724  0.13    0.0531  0.0206  0.0037  0.0004  0.0000  0.0
5     | 0.0037  0.1666  0.0982  0.0260  0.0042  0.0004  0.0000  0.0
6     | 0.50    0.2347  0.1035  0.0260  0.0042  0.0004  0.0000  0.0

Table: Probability that a number of dice (horizontal axis) carrying a certain number of spots (vertical axis) is selected under the optimal strategy.


Optimal exercise decisions: Example

Original question: Stop with three 5 and two 6?

    >> [v,a,a_inv] = heckmeck_v4(21:36, [], []);
    >> heckmeck_value([5,5,5,6,6],a_inv,v)
    2.6277

    >> [v,a,a_inv] = heckmeck_v4([21:31,33:36], [], [32]);
    >> heckmeck_value([5,5,5,6,6],a_inv,v)
    2.4352

    >> [v,a,a_inv] = heckmeck_v4([24,33,34], [21, 25,29], [32]);
    >> heckmeck_value([5,5,5,6,6],a_inv,v)
    2

Optimal strategy in Monte Carlo simulation

Figure: MC simulation vs. backward induction for the initial throw, full shelf. Convergence of the mean of the Monte Carlo simulation depending on the number of paths (blue line) in the initial throw, if the shelf starts at 21. Green: value function from backward induction, shown with +/- one standard deviation. Horizontal axis: number of paths (0 to 10000); vertical axis: number of worms.

Optimal strategy in Monte Carlo simulation (2)

In the whole game, what is our probability of winning if others pursue average strategies?

Test: Optimal strategy vs. fuzzy strategy. We test our results in a two-player game:
- Player 1 follows the optimal strategy (derived from the value function v).
- Player 2 derives his strategy from a value function misestimated by ±0.1 worms (by randomly perturbing v with a standard deviation of 0.1).

Result: Player 1 wins 17 out of 20 games!

Conclusion: Even a slight difference in optimality makes us win most of the games (law of large numbers, because of the many rounds per game!)

Extensions / References

Possible extensions:
- Incorporate risk in pricing? → Utility functions
- Compute optimal strategies for the whole game

Reference:
Reiner Knizia: Heckmeck am Bratwurmeck (Pickomino), Zoch-Verlag 2005

Value function dependent on shelf, initial throw

Figure: Expected number of worms under optimal strategy for different initial dice choices (die = 4, 5, 6 with #selected = 1 to 5), dependent on the minimal number available on the shelf. Horizontal axis: shelf starts at... (20 to 36); vertical axis: expected optimal worms.