Scenario Generation and Sampling Methods

Güzin Bayraksan and Tito Homem-de-Mello
SVAN 2016, IMPA, May 9th, 2016

Introduction

The team:
- Güzin Bayraksan, Integrated Systems Engineering, The Ohio State University, Columbus, Ohio
- Tito Homem-de-Mello, School of Business, Universidad Adolfo Ibáñez, Santiago, Chile

Linear programs with uncertainty

Many optimization problems can be formulated as linear programs:

    min b^T x   s.t.   Ax ≥ c.    (LP)

Suppose there is some uncertainty in the coefficients A and c. For example, the constraint Ax ≥ c could represent "total energy production must satisfy demand," but:
- Demand is uncertain.
- The actual produced amount from each energy source is a (random) percentage of the planned amount.

What to do?

Dealing with uncertainty

Some possibilities:
- Impose that the constraint Ax ≥ c must be satisfied regardless of the outcome of A and c.
- Impose that the constraint Ax ≥ c must be satisfied with some probability, i.e., solve

      min { b^T x : P(Ax ≥ c) ≥ 1 − α }

  for some small α > 0.
- Penalize the expected constraint violation, i.e., solve

      min b^T x + μ E[max{c − Ax, 0}]

  for some μ > 0.

Difficulty: How to solve any of these formulations?
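
As a concrete illustration of the third option: once the uncertainty is replaced by a finite sample, the penalized formulation becomes an ordinary LP with one auxiliary shortfall variable per scenario. The sketch below does this for a toy two-source energy model; all of the data (costs b, the yield range, the demand distribution) are invented for illustration, and the sampling step anticipates the approximation ideas discussed next.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(6)

# Hypothetical 2-source energy model: x = planned production, b = unit costs.
# Realized output is a random fraction a of the plan; demand c is random.
b = np.array([1.0, 1.3])
N, mu = 1000, 5.0
a = rng.uniform(0.7, 1.0, size=(N, 2))   # random yield per source, per scenario
c = rng.normal(10.0, 2.0, size=N)        # random demand, per scenario

# min b@x + (mu/N) * sum_j t_j   s.t.   t_j >= c_j - a_j@x,  t_j >= 0,  x >= 0.
# Decision vector is z = [x_1, x_2, t_1, ..., t_N].
cost = np.concatenate([b, np.full(N, mu / N)])
A_ub = np.hstack([-a, -np.eye(N)])       # encodes -a_j@x - t_j <= -c_j
res = linprog(cost, A_ub=A_ub, b_ub=-c, bounds=[(0, None)] * (2 + N))
print("planned production x:", res.x[:2].round(2))
```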

The need for approximation

Even before we think of optimization methods to solve the above problems, we need to deal with an even more basic issue:

How do we compute quantities such as P(Ax ≥ c) or E[max{c − Ax, 0}]?

Very hard to do! (except in special cases)

We need to approximate these quantities with something we can compute.

The estimation problem: an example

Suppose we have a vector of m random variables X := (X_1, ..., X_m) and we want to calculate

    g := E[G(X)] = E[G(X_1, ..., X_m)],

where G is a function that maps m-dimensional vectors to the real numbers.

Example: find the expected completion time of a project, with

    G(X) = max{X_1, X_2 + X_5, X_3 + X_4 + X_5}.

The project has 3 components, given by activities: 1; 2 and 5; 3, 4 and 5.

The estimation problem

How to do that? Suppose that each variable X_k can take r possible values, denoted x_k^1, ..., x_k^r. If we want to compute the exact value, we have to compute

    E[G(X)] = Σ_{k_1=1}^{r} Σ_{k_2=1}^{r} ... Σ_{k_m=1}^{r} G(x_1^{k_1}, ..., x_m^{k_m}) · P(X_1 = x_1^{k_1}, ..., X_m = x_m^{k_m}).

In the above example, suppose each variable can take r = 10 values. If the activity times are independent, then we have a total of 10^5 = 100,000 possible outcomes for G(X)!
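
For intuition, here is a minimal sketch of the exact computation for the 3-component example above, with invented data: each activity takes one of r = 3 equally likely values (rather than the r = 10 in the slide), so the enumeration stays tiny.

```python
import itertools

# Hypothetical data: each activity duration X_k takes r = 3 equally likely values.
values = [1.0, 2.0, 3.0]
probs = [1/3, 1/3, 1/3]
m = 5  # activities X_1, ..., X_5

def G(x):
    # Completion time = longest of the three activity paths.
    return max(x[0], x[1] + x[4], x[2] + x[3] + x[4])

# Exact expectation: enumerate all r^m joint outcomes (feasible only for tiny r, m).
g = 0.0
for outcome in itertools.product(range(len(values)), repeat=m):
    p = 1.0
    for k in outcome:
        p *= probs[k]
    g += G([values[k] for k in outcome]) * p

print(f"E[G(X)] = {g:.4f}  (computed from {len(values)**m} outcomes)")
```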

The estimation problem

Imagine now this project:

Path 1: 1-2-6-8-7-18
Path 2: 1-2-6-8-16-18
Path 3: 1-3-4-11-10-16-18
Path 4: 1-3-4-5-6-8-7-18
Path 5: 1-3-4-5-6-8-16-18
Path 6: 1-3-4-5-9-8-7-18
Path 7: 1-3-4-5-9-8-16-18
Path 8: 1-3-4-5-9-10-16-18
Path 9: 1-3-12-11-10-16-18
Path 10: 1-3-12-13-24-21-20-18
Path 11: 1-3-12-13-24-21-22-20-18
Path 12: 1-3-12-13-24-23-22-20-18

It is totally impractical to calculate the exact value! The problem is even worse if the distributions are continuous.

The need for scenarios

The example shows that we need a method that can help us approximate distributions with a finite (and not too large) set of scenarios. Issues:
- How to select such a set of scenarios?
- What guarantees can be given about the quality of the approximation?

As we shall see, there are two classes of approaches:
- Sampling methods
- Deterministic methods

Each class requires its own tools to answer the two questions above.

The estimation problem via sampling

Idea:
- Let X^j := (X^j_1, ..., X^j_m) denote one sample from the random vector X.
- Draw N independent and identically distributed (iid) samples X^1, ..., X^N.
- Compute

      ĝ_N := (1/N) Σ_{j=1}^{N} G(X^j).

Recall the Strong Law of Large Numbers: as N goes to infinity,

    lim_{N→∞} ĝ_N = E[G(X)]  with probability one (w.p.1),

so we can use ĝ_N as an approximation of g = E[G(X)].
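
A minimal sketch of the sampling estimator for the same completion-time example; the Exponential activity-time distribution and its mean are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def G(x):
    # Completion time of the 3-component project, vectorized over rows of x.
    return np.maximum(x[:, 0], np.maximum(x[:, 1] + x[:, 4],
                                          x[:, 2] + x[:, 3] + x[:, 4]))

N = 10_000
# Assumed input model: iid Exponential activity times with mean 2 (illustrative).
X = rng.exponential(scale=2.0, size=(N, 5))
g_hat = G(X).mean()
print(f"ĝ_N = {g_hat:.3f}")
```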

Assessing the quality of the approximation

ISSUE: ĝ_N is a random variable, since it depends on the sample. That is, in one experiment ĝ_N may be close to g, while in another it may differ from g by a large amount!

Example: 200 runs of the completion time problem with N = 50.

[Figure: probability density plot; histogram of the 200 values of ĝ_N over the range 19 to 29, shown together with a fitted normal density.]

The Central Limit Theorem

Note that

    E[ĝ_N] = E[(1/N) Σ_{j=1}^{N} G(X^j)] = (1/N) Σ_{j=1}^{N} E[G(X^j)] = g.

Also,

    Var(ĝ_N) = Var((1/N) Σ_{j=1}^{N} G(X^j)) = (1/N^2) Σ_{j=1}^{N} Var(G(X^j)) = (1/N) Var(G(X)).

The Central Limit Theorem asserts that, for N sufficiently large,

    √N (ĝ_N − g) / σ  ≈  Normal(0, 1),

where σ^2 = Var(G(X)).

Computing the margin of error of the estimate

The CLT implies that

    P( ĝ_N − 1.96 σ/√N  ≤  g  ≤  ĝ_N + 1.96 σ/√N ) ≈ 0.95.

That is, out of 100 experiments, on average in 95 of them the interval given by

    [ ĝ_N − 1.96 σ/√N,  ĝ_N + 1.96 σ/√N ]

will contain the true value g. The above interval is called a 95% confidence interval for g.

Note that σ^2 is usually unknown. Again, when N is large enough we can approximate σ^2 with the sample variance

    S_N^2 := (1/(N−1)) Σ_{j=1}^{N} (G(X^j) − ĝ_N)^2.
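
In code, the interval is one line once the sample is in hand. The sketch below reuses the illustrative project model from the previous snippets (again with assumed Exponential activity times) and N = 50, matching the histogram experiment.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50
# Same illustrative project-completion model as before.
X = rng.exponential(scale=2.0, size=(N, 5))
samples = np.maximum(X[:, 0], np.maximum(X[:, 1] + X[:, 4],
                                         X[:, 2] + X[:, 3] + X[:, 4]))

g_hat = samples.mean()
s_N = samples.std(ddof=1)            # S_N, the sample standard deviation
half_width = 1.96 * s_N / np.sqrt(N)
print(f"95% CI for g: [{g_hat - half_width:.3f}, {g_hat + half_width:.3f}]")
```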

The estimation problem via deterministic approximation

One idea is to approximate the distribution of each X_i with a discrete distribution with a small number of points (say, 3 points). But even then we have to sum up 3^m terms! Also, it is difficult to assess the quality of the approximation...

How about quadrature rules to approximate integrals (e.g., Simpson's rule)? They work well for low-dimensional problems.
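
For a one-dimensional sanity check of the quadrature idea, the sketch below compares Simpson's rule with a Monte Carlo estimate on a toy G and X ~ Uniform(0, 1) (both the function and the distribution are invented); each should land near the exact value 0.625.

```python
import numpy as np
from scipy.integrate import simpson

# Toy 1-D example: X ~ Uniform(0, 1) and G(x) = max(x, 0.5),
# so E[G(X)] = 0.25 + 0.375 = 0.625 exactly.
xs = np.linspace(0.0, 1.0, 101)
quad = simpson(np.maximum(xs, 0.5), x=xs)  # quadrature shines in low dimension

rng = np.random.default_rng(7)
mc = np.maximum(rng.uniform(size=10_000), 0.5).mean()
print(f"Simpson: {quad:.4f}   Monte Carlo: {mc:.4f}   exact: 0.6250")
```

In m dimensions a product Simpson rule needs a number of nodes exponential in m, which is the same curse of dimensionality as the exact sum above; the accuracy of sampling, by contrast, does not degrade with m.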

The basic idea

Consider a generic stochastic optimization problem of the form

    min_{x ∈ X} { g(x) := E[G(x, ξ)] },    (SP)

where:
- G is a real-valued function representing the quantity of interest (cost, revenues, etc.).
- The inputs for G are the decision vector x and a random vector ξ that represents the uncertainty in the problem.
- X is the set of feasible points.

The need for approximation

As before, if G is not a simple function, or if ξ is not low-dimensional, then we need to approximate the problem, since we cannot evaluate g(x) exactly. As before, we can use either sampling or deterministic approximations.

Issue: What is the effect of the approximation on the optimal value and/or the optimal solutions of the problem?

The newsvendor problem, revisited

- The newsvendor purchases papers in the morning at price c and sells them during the day at price r.
- Unsold papers are returned at the end of the day for a salvage value s.

If we want to maximize the expected profit, we can equivalently minimize the expected negative profit:

    min_{x ≥ 0} { g(x) := E[G(x, ξ)] },

where

    G(x, ξ) := cx − r min{x, ξ} − s (x − min{x, ξ})
             = (c − s) x − (r − s) min{x, ξ}.
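
A sketch of the newsvendor objective as reconstructed above (minimizing the negative profit); the prices c, r, s and the demand distribution are illustrative placeholders, not values from the lecture.

```python
import numpy as np

def G(x, xi, c=1.0, r=1.5, s=0.5):
    # Negative profit of ordering x when demand turns out to be xi:
    # pay c per paper, sell min(x, xi) at r, salvage the rest at s.
    return (c - s) * x - (r - s) * np.minimum(x, xi)

# Estimate g(x) = E[G(x, xi)] at a few order quantities (assumed demand model).
rng = np.random.default_rng(2)
xi = rng.exponential(scale=2.0, size=100_000)
for x in (0.5, 1.0, 2.0, 3.0):
    print(f"x = {x}: estimated g(x) = {G(x, xi).mean():+.3f}")
```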

Approximation with sampling

As we saw before, we can approximate the value of g(x) (for each given x) with a sample average. That is, for each x ∈ X we can draw a sample {ξ_x^1, ..., ξ_x^N} from the distribution of ξ, and approximate g(x) with

    g_N(x) := (1/N) Σ_{j=1}^{N} G(x, ξ_x^j).

But: it is useless to generate a new approximation for each x!

The Sample Average Approximation approach

The idea of the Sample Average Approximation (SAA) approach is to use the same sample for all x. That is, we draw a sample {ξ^1, ..., ξ^N} from the distribution of ξ, and approximate g(x) with

    ĝ_N(x) := (1/N) Σ_{j=1}^{N} G(x, ξ^j).
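
A minimal sketch of SAA for the newsvendor model above: one common sample defines ĝ_N, which is then minimized as a deterministic function. A grid search suffices here because this ĝ_N is convex and piecewise linear in x; the demand model and prices are the same illustrative assumptions as before.

```python
import numpy as np

rng = np.random.default_rng(3)

def G(x, xi, c=1.0, r=1.5, s=0.5):
    return (c - s) * x - (r - s) * np.minimum(x, xi)

# One common sample for all x -- the defining feature of SAA.
N = 270
xi = rng.exponential(scale=2.0, size=N)

def g_hat(x):
    return G(x, xi).mean()

# Minimize the (convex, piecewise linear) SAA objective on a fine grid.
grid = np.linspace(0.0, 5.0, 501)
values = np.array([g_hat(x) for x in grid])
x_N = grid[values.argmin()]
print(f"SAA solution x̂_N = {x_N:.2f}, value ν_N = {values.min():.3f}")
```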

The Sample Average Approximation approach

We can see that the approximation ĝ_N is very close to the real function. This suggests replacing the original problem with

    min_{x ∈ X} ĝ_N(x),

which can be solved using a deterministic optimization algorithm!

Questions:
- Does that always work, i.e., for any function G(x, ξ)?
- What is a good sample size to use?
- What can be said about the quality of the solution returned by the algorithm?

Asymptotic properties of SAA

Let us first study what happens as the sample size N goes to infinity. It is important to understand what that means. Consider the following hypothetical experiment:
- We draw a sample of infinite size, call it {ξ^1, ξ^2, ...}. We call that a sample path.
- Then, for each N, we construct the approximation

      ĝ_N(·) = (1/N) Σ_{j=1}^{N} G(·, ξ^j)

  using the first N terms of that sample path, and we solve

      min_{x ∈ X} ĝ_N(x).    (SP_N)
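
The nesting in this experiment — every ĝ_N built from the first N terms of one fixed sample path — is easy to miss, so here is a sketch of it for the illustrative newsvendor model used earlier.

```python
import numpy as np

rng = np.random.default_rng(4)

def G(x, xi, c=1.0, r=1.5, s=0.5):
    return (c - s) * x - (r - s) * np.minimum(x, xi)

# One long sample path; each ĝ_N uses only its first N terms.
path = rng.exponential(scale=2.0, size=270)
grid = np.linspace(0.0, 5.0, 501)

for N in (10, 30, 90, 270):
    vals = np.array([G(x, path[:N]).mean() for x in grid])
    print(f"N = {N:4d}: x̂_N = {grid[vals.argmin()]:.2f}, ν_N = {vals.min():.3f}")
```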

Asymptotic properties of SAA

Let
- x̂_N := an optimal solution of (SP_N)
- S_N := the set of optimal solutions of (SP_N)
- ν_N := the optimal value of (SP_N)

and
- x* := an optimal solution of (SP)
- S* := the set of optimal solutions of (SP)
- ν* := the optimal value of (SP)

As the sample size N goes to infinity, does
- x̂_N converge to some x*?
- S_N converge to the set S*?
- ν_N converge to ν*?

Asymptotic properties, continuous distributions

We illustrate the asymptotic properties with the newsvendor problem. We will study separately the cases when demand ξ has a continuous and a discrete distribution.

Suppose first that demand has an Exponential(10) distribution.

[Figure: ĝ_N(x) for N = 10, 30, 90, 270 plotted together with the true function g(x), for x in [0, 5].]

Asymptotic properties, continuous distributions

It seems the functions ĝ_N are converging to g. The table lists the values of x̂_N and ν_N (N = ∞ corresponds to the true function):

    N      |   10     30     90    270      ∞
    x̂_N    |  1.46   1.44   1.54   2.02   2.23
    ν_N    | −1.11  −0.84  −0.98  −1.06  −1.07

So we see that x̂_N → x* and ν_N → ν*!

Asymptotic properties, discrete distributions

Now let us look at the case when ξ has a discrete distribution. Suppose demand has a discrete uniform distribution on {1, 2, ..., 10}.

[Figure: ĝ_N(x) for N = 10, 30, 90, 270 plotted together with the true function g(x), for x in [0, 5].]

Asymptotic properties, discrete distributions

Again, it seems the functions ĝ_N are converging to g. The table lists the values of x̂_N and ν_N (N = ∞ corresponds to the true function):

    N      |   10     30     90    270      ∞
    x̂_N    |   2      3      3      2    [2, 3]
    ν_N    | −2.00  −2.50  −1.67  −1.35  −1.50

We see that ν_N → ν*. However, x̂_N does not seem to be converging at all. On the other hand, x̂_N is oscillating between two optimal solutions of the true problem! How general is this conclusion?
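
The oscillation is easy to reproduce. Under the assumed prices used in the earlier sketches (which differ from the lecture's), the toy problem below happens to have a whole interval [5, 6] of optimal order quantities, and the SAA minimizer jumps between its endpoints from one sample path to the next.

```python
import numpy as np

rng = np.random.default_rng(5)

def G(x, xi, c=1.0, r=1.5, s=0.5):
    return (c - s) * x - (r - s) * np.minimum(x, xi)

# Discrete uniform demand on {1, ..., 10}; an optimal order is attained at an
# integer, so it suffices to compare the candidate points x = 0, 1, ..., 10.
candidates = np.arange(0, 11)
for trial in range(5):
    xi = rng.integers(1, 11, size=270)   # a fresh sample path each trial
    vals = np.array([G(x, xi).mean() for x in candidates])
    print(f"sample path {trial}: x̂_N = {candidates[vals.argmin()]}")
```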

Convergence result

We can see from both figures that ĝ_N(·) converges uniformly to g(·). Uniform convergence occurs, for example, when the functions are convex. The following result is general:

Theorem. When uniform convergence holds, we have the following results:
1. ν_N → ν* with probability one (w.p.1).
2. Suppose that there exists a compact set C such that (i) S* ⊆ C and S_N ⊆ C w.p.1 for N large enough, and (ii) the objective function is finite and continuous on C. Then dist(S_N, S*) → 0 w.p.1.

Convergence result (cont.)

What does convergence with probability one mean? Recall that the functions ĝ_N in the above example were constructed from a single sample path. The theorem tells us that, regardless of the sample path we pick, we have convergence as N → ∞!

So let us repeat the above experiment (only for N = 270) multiple times, each time with a different sample path:

[Figure: several realizations of ĝ_N(x) for N = 270, for x in [0, 5]. (a) Exponential demand; (b) Discrete uniform demand.]

Convergence result (cont.)

We see that for some sample paths we have a very good approximation for this N (in this case, N = 270), but for others we don't. Why? Don't we have convergence for all sample paths?

The problem is that the theorem only guarantees convergence as N → ∞. So, for some sample paths we quickly get a good approximation, whereas for others we may need a larger N to achieve the same quality.

So, if we pick one sample of size N and solve min ĝ_N(x) as indicated by the SAA approach, how do we know whether we are on a good or a bad sample path? The answer is... we don't! So we need to have some probabilistic guarantees.