On Complexity of Multistage Stochastic Programs


Alexander Shapiro
School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205, USA
e-mail: ashapiro@isye.gatech.edu

Abstract. In this paper we derive estimates of the sample sizes required to solve a multistage stochastic programming problem with a given accuracy by the (conditional sampling) sample average approximation method. The presented analysis is self-contained and is based on a relatively elementary, one-dimensional Cramér's Large Deviations Theorem.

Key words: stochastic programming, Monte Carlo sampling, sample average method, large deviations exponential bounds, complexity.

1 Introduction

Consider the following stochastic programming problem
\[
\min_{x \in X} \left\{ f(x) := \mathbb{E}[F(x,\xi)] \right\}, \tag{1.1}
\]
where ξ is a random vector supported on a set Ξ ⊂ R^d, the expectation in (1.1) is taken with respect to a (known) probability distribution of ξ, X is a nonempty subset of R^n and F : X × Ξ → R. In the case of two-stage stochastic programming, the function F(x, ξ) is given as the optimal value of a corresponding second-stage problem. In that case the assumption that F(x, ξ) is real valued for all x ∈ X and ξ ∈ Ξ can hold only if the corresponding recourse is relatively complete.

Only in very specific situations can the expected value function f(x) be written in closed form; in general it has to be calculated by numerical integration. Already for d ≥ 5 random variables it is typically impossible to evaluate the corresponding multidimensional integral (expectation) with high accuracy. This is what makes stochastic programming problems of the form (1.1) genuinely difficult.

A way of estimating the expected value function is suggested by the Monte Carlo method. That is, a random sample ξ¹, ..., ξ^N of N realizations of ξ is generated, and the expected value function f(x) is approximated by the sample average function f̂_N(x) := N^{-1} Σ_{i=1}^N F(x, ξ^i). This is the basic idea of the so-called sample average approximation (SAA) method. It is possible to show, under mild regularity conditions, that for ε > 0 and α ∈ (0, 1) the sample size
\[
N \ge \frac{O(1)\,\sigma^2}{\varepsilon^2}\left[\, n \log\left(\frac{O(1)\,DL}{\varepsilon}\right) + \log\left(\frac{1}{\alpha}\right) \right] \tag{1.2}
\]
guarantees that any ε/2-optimal solution of the SAA problem is an ε-optimal solution of the true problem with probability at least 1 − α (see [3, 7, 8]). Here O(1) is a generic constant, D is the diameter of the set X (assumed to be finite), L is a Lipschitz constant of f(x) and σ² is a certain constant measuring variability of the objective function F(x, ξ). Recall that, for ε > 0, a point x̄ is said to be an ε-optimal solution of problem (1.1) if x̄ ∈ X and f(x̄) ≤ inf_{x∈X} f(x) + ε.

In a sense, the estimate (1.2) of the sample size gives a bound on the complexity of solving, with a specified probability, the (true) problem (1.1) by the sample average approximation. Note that the estimated sample size grows linearly in the dimension n of the first-stage problem and is proportional to the squared ratio of the variability coefficient σ to the desired accuracy ε. (The following Example shows that this estimate cannot be significantly improved.) This indicates that one may expect to solve the true problem (1.1) to a reasonable accuracy with a manageable sample size by using the SAA method. And, indeed, this was verified in various numerical experiments (cf. [4, 5, 9]).

Example 1 Consider problem (1.1) with F(x, ξ) := ‖x‖^{2k} − 2k⟨ξ, x⟩, where k is a positive integer, X := {x ∈ R^n : ‖x‖ ≤ 1}, ⟨x, y⟩ denotes the standard scalar product of two vectors x, y ∈ R^n, and ‖x‖ := ⟨x, x⟩^{1/2}. Suppose, further, that the random vector ξ has normal distribution N(0, σ²I_n), where σ² is a positive constant, i.e., the components ξ_i of ξ are independent and ξ_i ~ N(0, σ²), i = 1, ..., n. It follows that f(x) = ‖x‖^{2k}, and hence for ε ∈ [0, 1] the set of ε-optimal solutions of the true problem (1.1) is {x : ‖x‖^{2k} ≤ ε}. Now let ξ¹, ..., ξ^N be an iid random sample of ξ and ξ̄_N := (ξ¹ + ... + ξ^N)/N. The corresponding sample average function is f̂_N(x) = ‖x‖^{2k} − 2k⟨ξ̄_N, x⟩, and the optimal solution x̂_N of the SAA problem is x̂_N = γ ξ̄_N/‖ξ̄_N‖, where γ := ‖ξ̄_N‖^{1/(2k−1)} if ‖ξ̄_N‖ ≤ 1, and γ := 1 if ‖ξ̄_N‖ > 1. It follows that, for ε ∈ (0, 1), the optimal solution of the corresponding SAA problem is an ε-optimal solution of the true problem iff ‖ξ̄_N‖ ≤ ε^{1/ν}, where ν := 2k/(2k−1).

We have that ξ̄_N ~ N(0, σ²N^{-1}I_n), and hence N‖ξ̄_N‖²/σ² has the chi-square distribution with n degrees of freedom. Consequently, the probability that ‖ξ̄_N‖ > ε^{1/ν} is equal to the probability P(χ²_n > Nε^{2/ν}/σ²). Moreover, E[χ²_n] = n and P(χ²_n > n) increases and tends to 1/2 as n increases, e.g., P(χ²_1 > 1) = 0.3173, P(χ²_2 > 2) = 0.3679, P(χ²_3 > 3) = 0.3916, etc. Consequently, for α ∈ (0, 0.3) and ε ∈ (0, 1), for example, the sample size N should satisfy
\[
N > \frac{n\sigma^2}{\varepsilon^{2/\nu}} \tag{1.3}
\]
in order to have the property that, with probability at least 1 − α, an (exact) optimal solution of the SAA problem is an ε-optimal solution of the true problem. Compared with (1.2), the lower bound (1.3) also grows linearly in n and is proportional to σ²/ε^{2/ν}. It remains to note that the constant ν decreases to one as k increases.

The aim of this paper is to extend this analysis of the SAA method to the multistage stochastic programming (MSP) setting. A discussion of the complexity of MSP can be found in [8]. It was already argued there that the complexity of the SAA method, when applied to MSP, grows quickly with the number of stages, so that seemingly simple MSP problems can be computationally unmanageable. We estimate the sample sizes required to solve the true problem with a given accuracy using tools of Large Deviations (LD) theory (see, e.g., [2] for a thorough discussion of LD theory). In that respect our analysis is self-contained and rather elementary, since we employ only the upper bound of the (one-dimensional) Cramér LD Theorem. That is, if X_1, ..., X_N is a sequence of iid realizations of a random variable X and X̄_N := N^{-1} Σ_{i=1}^N X_i is the corresponding average, then
\[
P(\bar X_N \ge a) \le e^{-N I(a)}. \tag{1.4}
\]
Here P(A) denotes the probability of an event A,
\[
I(z) := \sup_{t \in \mathbb{R}} \left\{ tz - \log M(t) \right\}
\]
is the so-called rate function, and M(t) := E[e^{tX}] is the moment generating function of the random variable X.
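As a quick numerical companion to the Example (a sketch, not part of the original analysis; SciPy and the particular values of n, k, σ and ε below are illustrative assumptions), the failure probability P(χ²_n > Nε^{2/ν}/σ²) can be evaluated directly, showing how it behaves around the lower bound (1.3):

```python
# Numerical check of Example 1: the SAA solution is eps-optimal exactly
# when ||xi_bar_N|| <= eps^(1/nu), which fails with probability
# P( chi2_n > N * eps^(2/nu) / sigma^2 ).  Parameter values are
# illustrative assumptions only.
from scipy.stats import chi2

n, k, sigma, eps = 100, 2, 1.0, 0.01
nu = 2 * k / (2 * k - 1)                  # nu = 2k/(2k-1) from the Example
N_bound = n * sigma**2 / eps**(2 / nu)    # lower bound (1.3)

for N in (N_bound / 2, N_bound, 4 * N_bound):
    p_fail = chi2.sf(N * eps**(2 / nu) / sigma**2, df=n)
    print(f"N = {N:>9.0f}: P(SAA solution is not eps-optimal) = {p_fail:.4f}")
```

At N equal to the bound (1.3) the failure probability is close to P(χ²_n > n), i.e., not far below 1/2, while taking N a few times larger drives it to essentially zero, which is the behaviour the Example uses.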

Let us make the following simple observation, which will be used in our derivations. Let X and Y be two random variables and ε₁, ε₂ ∈ R. If X ≤ ε₁ and Y ≤ ε₂, where ε₁ + ε₂ = ε, then X + Y ≤ ε, and hence
\[
\{\omega : X(\omega) + Y(\omega) > \varepsilon\} \subset \{\omega : X(\omega) > \varepsilon_1\} \cup \{\omega : Y(\omega) > \varepsilon_2\}.
\]
This implies the following inequality for the corresponding probabilities:
\[
P(X + Y > \varepsilon) \le P(X > \varepsilon_1) + P(Y > \varepsilon_2). \tag{1.5}
\]

2 Sample average approximations of multistage stochastic programs

Consider the following T-stage stochastic programming problem
\[
\min_{x_1 \in X_1} F_1(x_1) + \mathbb{E}\Big[ \inf_{x_2 \in X_2(x_1,\xi_2)} F_2(x_2,\xi_2) + \mathbb{E}\big[ \cdots + \mathbb{E}\big[ \inf_{x_T \in X_T(x_{T-1},\xi_T)} F_T(x_T,\xi_T) \big] \big] \Big] \tag{2.1}
\]
driven by the random data process ξ₂, ..., ξ_T. Here x_t ∈ R^{n_t}, t = 1, ..., T, are decision variables, F_t : R^{n_t} × R^{d_t} → R are continuous functions and X_t : R^{n_{t−1}} × R^{d_t} ⇉ R^{n_t}, t = 2, ..., T, are measurable multifunctions; the function F₁ : R^{n₁} → R and the set X₁ ⊂ R^{n₁} are deterministic. We assume that the set X₁ is nonempty. For example, in the linear case
\[
F_t(x_t,\xi_t) := \langle c_t, x_t\rangle, \quad X_1 := \{x_1 : A_1 x_1 = b_1,\; x_1 \ge 0\},
\]
\[
X_t(x_{t-1},\xi_t) := \{x_t : B_t x_{t-1} + A_t x_t = b_t,\; x_t \ge 0\}, \quad t = 2, ..., T,
\]
ξ₁ := (c₁, A₁, b₁) is known at the first stage (and hence is nonrandom), and ξ_t := (c_t, B_t, A_t, b_t) ∈ R^{d_t}, t = 2, ..., T, are data vectors, some (or all) elements of which can be random. In the sequel we use ξ_t to denote both the random data vector and its particular realization; which of these two meanings is intended will be clear from the context.

If we denote by Q₂(x₁, ξ₂) the optimal value of the (T−1)-stage problem
\[
\min_{x_2 \in X_2(x_1,\xi_2)} F_2(x_2,\xi_2) + \mathbb{E}\big[ \cdots + \mathbb{E}\big[ \min_{x_T \in X_T(x_{T-1},\xi_T)} F_T(x_T,\xi_T) \big] \big], \tag{2.2}
\]
then we can write the T-stage problem (2.1) in the following form of a two-stage programming problem:
\[
\min_{x_1 \in X_1} F_1(x_1) + \mathbb{E}\big[ Q_2(x_1,\xi_2) \big]. \tag{2.3}
\]
Note, however, that if T > 2, then problem (2.2) is in itself a stochastic programming problem. Consequently, if the number of scenarios involved in (2.2) is very large, or infinite, then the optimal value Q₂(x₁, ξ₂) can be calculated only approximately, say by sampling.

For the sake of simplicity we make the following derivations for the 3-stage problem, i.e., we assume that T = 3 (it will be clear how the obtained results can be extended to an analysis of T > 3). In that case Q₂(x₁, ξ₂) is given by the optimal value of the problem
\[
\min_{x_2 \in X_2(x_1,\xi_2)} F_2(x_2,\xi_2) + \mathbb{E}\big[ Q_3(x_2,\xi_3) \mid \xi_2 \big], \tag{2.4}
\]
where the expectation is taken with respect to the conditional distribution of ξ₃ given ξ₂, and
\[
Q_3(x_2,\xi_3) := \inf_{x_3 \in X_3(x_2,\xi_3)} F_3(x_3,\xi_3).
\]
We make the following assumption:

For every x₁ ∈ X₁ the expectation E[Q₂(x₁, ξ₂)] is well defined and finite valued.

Of course, finite valuedness of E[Q₂(x₁, ξ₂)] can hold only if Q₂(x₁, ξ₂) is finite valued for a.e. ξ₂, which in turn implies that X₂(x₁, ξ₂) is nonempty for a.e. ξ₂, etc. That is, the above assumption implies that the recourse is relatively complete.

Now let ξ₂^i, i = 1, ..., N₁, be a random sample of independent realizations of the random vector ξ₂. We can approximate problem (2.3) by the following SAA problem:
\[
\min_{x_1 \in X_1} \Big\{ \hat f_{N_1}(x_1) := F_1(x_1) + \frac{1}{N_1}\sum_{i=1}^{N_1} Q_2(x_1,\xi_2^i) \Big\}. \tag{2.5}
\]
Since the values Q₂(x₁, ξ₂^i) are not given explicitly, we need to estimate them by conditional sampling (note that in order for the SAA method to produce consistent estimators, conditional sampling is required; see [6]). That is, for each i = 1, ..., N₁ we generate a random sample ξ₃^{ij}, j = 1, ..., N₂, of N₂ independent realizations according to the conditional distribution of ξ₃ given ξ₂^i. Consequently, we approximate Q₂(x₁, ξ₂^i) by
\[
\hat Q_{2,N_2}(x_1,\xi_2^i) := \inf_{x_2 \in X_2(x_1,\xi_2^i)} \Big\{ F_2(x_2,\xi_2^i) + \frac{1}{N_2}\sum_{j=1}^{N_2} Q_3(x_2,\xi_3^{ij}) \Big\}. \tag{2.6}
\]
Finally, we approximate the true (expected value) problem (2.3) by the following so-called sample average approximation (SAA) problem:
\[
\min_{x_1 \in X_1} \Big\{ \tilde f_{N_1,N_2}(x_1) := F_1(x_1) + \frac{1}{N_1}\sum_{i=1}^{N_1} \hat Q_{2,N_2}(x_1,\xi_2^i) \Big\}. \tag{2.7}
\]
The above SAA problem is obtained by approximating the objective function f(x₁) := F₁(x₁) + E[Q₂(x₁, ξ₂)] of problem (2.3) with f̃_{N₁,N₂}(x₁).
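To make the construction (2.5)–(2.7) concrete, the following minimal sketch implements the conditional-sampling SAA estimator for a toy 3-stage problem with scalar decisions; the particular problem data (quadratic first-stage cost, interval constraints, uniform and exponential distributions) are invented for illustration and do not come from the paper. For simplicity ξ₃ is taken independent of ξ₂ (the between-stages independence condition used in the next section), so a single third-stage sample can be reused for every ξ₂^i; since both inner problems are linear in the decision, the infima in (2.6) are attained at interval endpoints and are available in closed form.

```python
# Toy 3-stage problem solved by conditional-sampling SAA, following
# (2.5)-(2.7).  All problem data below are illustrative assumptions:
#   stage 1:  min  x1^2 + E[ Q2(x1, xi2) ]            over x1 in [0, 1]
#   stage 2:  Q2 = inf { 0.2*x2 + E[ Q3(x2, xi3) ] : 0 <= x2 <= x1 + xi2 }
#   stage 3:  Q3 = inf { (xi3 - 1)*x3              : 0 <= x3 <= x2 }
# with xi2 ~ Uniform(0, 1) and xi3 ~ Exponential(1), independent,
# so a single third-stage sample is reused for every xi2^i.
import numpy as np

rng = np.random.default_rng(0)
N1, N2, c2 = 200, 500, 0.2

xi2 = rng.uniform(0.0, 1.0, size=N1)    # sample of xi2^i, i = 1..N1
xi3 = rng.exponential(1.0, size=N2)     # common sample of xi3^j, j = 1..N2

# Both inner problems are linear in the decision, so each infimum in
# (2.6) is attained at an endpoint of the feasible interval.
slope = c2 + np.mean(np.minimum(xi3 - 1.0, 0.0))   # coefficient of x2

def Q2_hat(x1):
    """SAA estimates (2.6) of Q2(x1, xi2^i), as a vector over i."""
    upper = x1 + xi2                       # right endpoints of x2-intervals
    return np.minimum(slope, 0.0) * upper  # endpoint minimization on [0, upper]

def f_tilde(x1):
    """SAA objective (2.7)."""
    return x1**2 + Q2_hat(x1).mean()

grid = np.linspace(0.0, 1.0, 1001)      # crude minimization over X1 = [0, 1]
x1_hat = min(grid, key=f_tilde)
print(f"SAA first-stage solution x1 = {x1_hat:.3f} "
      f"using N1*N2 = {N1 * N2} scenarios")
```

Note that a single evaluation of f̃_{N₁,N₂} already involves N₁N₂ third-stage scenarios, which is precisely the scenario-count effect quantified in the Discussion below.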

3 Sample size estimates

In order to proceed with our analysis we need to estimate the probability
\[
P\Big\{ \sup_{x_1 \in X_1} \big| f(x_1) - \tilde f_{N_1,N_2}(x_1) \big| > \varepsilon \Big\} \tag{3.1}
\]
for an arbitrary constant ε > 0. To this end we use the following result about Large Deviations (LD) bounds for the uniform convergence of sample average approximations.

Consider a function h : X × Ξ → R and the corresponding expected value function φ(x) := E[h(x, ξ)], where the expectation is taken with respect to the probability distribution P of the random vector ξ = ξ(ω), X is a nonempty closed subset of R^n and Ξ ⊂ R^d is the support of the probability distribution P. Assume that for every x ∈ X the expectation φ(x) is well defined, i.e., h(x, ·) is measurable and P-integrable. Let ξ¹, ..., ξ^N be an iid sample of the random vector ξ(ω), and let φ̂_N(x) := N^{-1} Σ_{i=1}^N h(x, ξ^i) be the corresponding sample average function.

Theorem 1 Suppose that the set X has finite diameter D, and the following conditions hold: (i) there exists a constant σ > 0 such that
\[
M_x(t) \le \exp\{\sigma^2 t^2/2\}, \quad \forall t \in \mathbb{R}, \ \forall x \in X, \tag{3.2}
\]
where M_x(t) is the moment generating function of the random variable h(x, ξ) − φ(x); (ii) there exists a constant L > 0 such that
\[
|h(x',\xi) - h(x,\xi)| \le L\|x' - x\|, \quad \forall \xi \in \Xi, \ \forall x', x \in X. \tag{3.3}
\]
Then for any ε > 0,
\[
P\Big\{ \sup_{x \in X} \big| \hat\varphi_N(x) - \varphi(x) \big| \ge \varepsilon \Big\} \le O(1)\left(\frac{DL}{\varepsilon}\right)^{n} \exp\Big\{ -\frac{N\varepsilon^2}{16\sigma^2} \Big\}. \tag{3.4}
\]

To make the paper self-contained we give a proof of this theorem in the Appendix.

We can apply the LD bound (3.4) to obtain estimates of the probability (3.1). We have that
\[
\sup_{x_1\in X_1} \big| f(x_1) - \tilde f_{N_1,N_2}(x_1) \big| \le \sup_{x_1\in X_1} \big| f(x_1) - \hat f_{N_1}(x_1) \big| + \sup_{x_1\in X_1} \big| \hat f_{N_1}(x_1) - \tilde f_{N_1,N_2}(x_1) \big|,
\]
and hence, by (1.5),
\[
P\Big\{ \sup_{x_1\in X_1} \big| f(x_1) - \tilde f_{N_1,N_2}(x_1) \big| > \varepsilon \Big\} \le P\Big\{ \sup_{x_1\in X_1} \big| f(x_1) - \hat f_{N_1}(x_1) \big| > \varepsilon/2 \Big\} + P\Big\{ \sup_{x_1\in X_1} \big| \hat f_{N_1}(x_1) - \tilde f_{N_1,N_2}(x_1) \big| > \varepsilon/2 \Big\}. \tag{3.5}
\]
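For a target failure probability α, the bound (3.4) can be inverted to give a sample-size requirement of the same form as (1.2). The following small helper makes the dependence explicit; treating the generic O(1) constant as 1 is an assumption made purely for illustration, since the theorem leaves it unspecified:

```python
# Sample size implied by the bound (3.4): requiring
#   C * (D*L/eps)^n * exp(-N * eps^2 / (16*sigma^2)) <= alpha
# and solving for N gives
#   N >= (16*sigma^2/eps^2) * ( n*log(C*D*L/eps) + log(1/alpha) ).
# The generic O(1) constant is set to C = 1, an assumption made only
# to exhibit the growth of the estimate.
import math

def required_sample_size(n, D, L, sigma, eps, alpha, C=1.0):
    return math.ceil(16 * sigma**2 / eps**2
                     * (n * math.log(C * D * L / eps) + math.log(1 / alpha)))

# Linear growth in the dimension n, only logarithmic growth in 1/alpha:
for n in (10, 100, 1000):
    print(n, required_sample_size(n, D=1.0, L=1.0, sigma=1.0,
                                  eps=0.1, alpha=0.01))
```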

Note that
\[
f(x_1) - \hat f_{N_1}(x_1) = \mathbb{E}[Q_2(x_1,\xi_2)] - \frac{1}{N_1}\sum_{i=1}^{N_1} Q_2(x_1,\xi_2^i),
\]
and
\[
\hat f_{N_1}(x_1) - \tilde f_{N_1,N_2}(x_1) = \frac{1}{N_1}\sum_{i=1}^{N_1} \big[ Q_2(x_1,\xi_2^i) - \hat Q_{2,N_2}(x_1,\xi_2^i) \big].
\]
Let us assume, for the sake of simplicity, the between-stages independence of the random process. That is, suppose that the following condition holds.

(A1) The random vectors ξ₂ and ξ₃ are independent.

Of course, under this condition the conditional expectation in formula (2.4) does not depend on ξ₂. Also, in that case the conditional sample ξ₃^{ij} has the (marginal) distribution of ξ₃, is independent of ξ₂^i, and can be generated in two ways: namely, we can either use the same random sample ξ₃^{ij} = ξ₃^j for each i = 1, ..., N₁, or these samples can be generated independently of each other. Let us make, further, the following assumptions.

(A2) The set X₁ has finite diameter D₁.

(A3) There is a constant L₁ > 0 such that |Q₂(x₁′, ξ₂) − Q₂(x₁, ξ₂)| ≤ L₁‖x₁′ − x₁‖ for all x₁′, x₁ ∈ X₁ and a.e. ξ₂.

(A4) There exists a constant σ₁ > 0 such that for any x₁ ∈ X₁ it holds that M_{1,x₁}(t) ≤ exp{σ₁²t²/2} for all t ∈ R, where M_{1,x₁}(t) is the moment generating function of Q₂(x₁, ξ₂) − E[Q₂(x₁, ξ₂)].

(A5) There is a positive constant D₂ such that for every x₁ ∈ X₁ and a.e. ξ₂ the set X₂(x₁, ξ₂) has a finite diameter less than or equal to D₂.

(A6) There is a constant L₂ > 0 such that
\[
|F_2(x_2',\xi_2) - F_2(x_2,\xi_2)| + |Q_3(x_2',\xi_3) - Q_3(x_2,\xi_3)| \le L_2\|x_2' - x_2\|
\]
for all x₂′, x₂ ∈ X₂(x₁, ξ₂), x₁ ∈ X₁, and a.e. ξ₂ and ξ₃.

(A7) There exists a constant σ₂ > 0 such that for any x₂ ∈ X₂(x₁, ξ₂), all x₁ ∈ X₁ and a.e. ξ₂ it holds that M_{2,x₂}(t) ≤ exp{σ₂²t²/2} for all t ∈ R, where M_{2,x₂}(t) is the moment generating function of Q₃(x₂, ξ₃) − E[Q₃(x₂, ξ₃)].

By Theorem 1, under assumptions (A2)–(A4), we have that
\[
P\Big\{ \sup_{x_1\in X_1} \big| f(x_1) - \hat f_{N_1}(x_1) \big| > \varepsilon/2 \Big\} \le O(1)\left(\frac{D_1 L_1}{\varepsilon}\right)^{n_1} \exp\Big\{ -\frac{O(1)\,N_1 \varepsilon^2}{\sigma_1^2} \Big\}. \tag{3.6}
\]
For x₁ ∈ X₁, ξ₂, and a random sample ξ₃¹, ..., ξ₃^{N₂} of N₂ independent replications of ξ₃, consider the function
\[
\hat\psi_{N_2}(x_2,\xi_2) := F_2(x_2,\xi_2) + \frac{1}{N_2}\sum_{j=1}^{N_2} Q_3(x_2,\xi_3^j)
\]
and its expected value
\[
\psi(x_2,\xi_2) = F_2(x_2,\xi_2) + \mathbb{E}[Q_3(x_2,\xi_3)].
\]
By Theorem 1, under assumptions (A5)–(A7), we have that for any x₁ ∈ X₁,
\[
P\Big\{ \sup_{x_2 \in X_2(x_1,\xi_2)} \big| \hat\psi_{N_2}(x_2,\xi_2) - \psi(x_2,\xi_2) \big| > \varepsilon/2 \Big\} \le C(N_2), \tag{3.7}
\]
where
\[
C(N_2) = O(1)\left(\frac{D_2 L_2}{\varepsilon}\right)^{n_2} \exp\Big\{ -\frac{O(1)\,N_2 \varepsilon^2}{\sigma_2^2} \Big\}.
\]
It follows that
\[
P\Big\{ \Big| \inf_{x_2\in X_2(x_1,\xi_2)} \hat\psi_{N_2}(x_2,\xi_2) - \inf_{x_2\in X_2(x_1,\xi_2)} \psi(x_2,\xi_2) \Big| > \varepsilon/2 \Big\} \le C(N_2). \tag{3.8}
\]
Note that inf_{x₂∈X₂(x₁,ξ₂)} ψ(x₂, ξ₂) = Q₂(x₁, ξ₂), and for ξ₂ = ξ₂^i,
\[
\inf_{x_2\in X_2(x_1,\xi_2^i)} \hat\psi_{N_2}(x_2,\xi_2^i) = \hat Q_{2,N_2}(x_1,\xi_2^i).
\]
It follows from (3.8) that (for both strategies of using the same or independent samples for each ξ₂^i) the following inequality holds:
\[
P\Big\{ \big| \hat f_{N_1}(x_1) - \tilde f_{N_1,N_2}(x_1) \big| > \varepsilon/2 \Big\} \le C(N_2). \tag{3.9}
\]
Suppose, further, that:

There is a constant L₃ > 0 such that f̂_{N₁}(·) − f̃_{N₁,N₂}(·) is Lipschitz continuous on X₁ with constant L₃.

Then, by constructing a ν-net in X₁ and using (3.9), it can be shown (in a way similar to the proof of Theorem 1 in the Appendix) that
\[
P\Big\{ \sup_{x_1\in X_1} \big| \hat f_{N_1}(x_1) - \tilde f_{N_1,N_2}(x_1) \big| > \varepsilon/2 \Big\} \le O(1)\left(\frac{D_1 L_3}{\varepsilon}\right)^{n_1}\left(\frac{D_2 L_2}{\varepsilon}\right)^{n_2} \exp\Big\{ -\frac{O(1)\,N_2 \varepsilon^2}{\sigma_2^2} \Big\}. \tag{3.10}
\]
Combining (3.5) with the estimates (3.6) and (3.10) gives an upper bound for the probability (3.1). Let us also observe that if x̂₁ is an ε/2-optimal solution of the SAA problem (2.7) and sup_{x₁∈X₁} |f(x₁) − f̃_{N₁,N₂}(x₁)| ≤ ε/2, then x̂₁ is an ε-optimal solution of the true problem (2.3). Therefore, we obtain the following result.

Theorem 2 Under the specified assumptions, for ε > 0 and α ∈ (0, 1) and sample sizes N₁ and N₂ satisfying
\[
O(1)\left[ \left(\frac{D_1 L_1}{\varepsilon}\right)^{n_1} \exp\Big\{-\frac{O(1)\,N_1 \varepsilon^2}{\sigma_1^2}\Big\} + \left(\frac{D_1 L_3}{\varepsilon}\right)^{n_1}\left(\frac{D_2 L_2}{\varepsilon}\right)^{n_2} \exp\Big\{-\frac{O(1)\,N_2 \varepsilon^2}{\sigma_2^2}\Big\} \right] \le \alpha, \tag{3.11}
\]
we have that any ε/2-optimal solution of the SAA problem (2.7) is an ε-optimal solution of the true problem (2.3) with probability at least 1 − α.

In particular, suppose that N₁ = N₂. Then for L := max{L₁, L₂, L₃}, D := max{D₁, D₂} and σ := max{σ₁, σ₂} we can use the following estimate of the required sample size N = N₁ = N₂:
\[
O(1)\left(\frac{DL}{\varepsilon}\right)^{n_1+n_2} \exp\Big\{-\frac{O(1)\,N \varepsilon^2}{\sigma^2}\Big\} \le \alpha, \tag{3.12}
\]
which is equivalent to
\[
N \ge \frac{O(1)\,\sigma^2}{\varepsilon^2}\left[ (n_1+n_2) \log\left(\frac{O(1)\,DL}{\varepsilon}\right) + \log\left(\frac{1}{\alpha}\right) \right]. \tag{3.13}
\]

4 Discussion

The estimate (3.13), for 3-stage programs, looks similar to the estimate (1.2) for two-stage programs. Note, however, that if we use the SAA method with conditional sampling and respective sample sizes N₁ and N₂, then the total number of scenarios is N = N₁N₂. Therefore, our analysis seems to indicate that for 3-stage problems we need random samples with a total number of scenarios of the order of the square of the corresponding sample size for two-stage problems. This analysis can be extended to T-stage problems, with the conclusion that the total number of scenarios needed to solve the true problem with a reasonable accuracy grows exponentially with the number of stages T. Some numerical experiments seem to confirm this conclusion (cf. [1]).

Of course, it should be mentioned that the above analysis does not prove, in a rigorous mathematical sense, that the complexity of multistage programming grows exponentially with the number of stages. It only indicates that the SAA method, which has shown considerable promise for solving two-stage problems, could be practically inapplicable for solving multistage problems with a large (say, greater than 5) number of stages.

Our analysis was performed under several simplifying assumptions. In particular, we considered a 3-stage setting and assumed the between-stages independence condition. An extension of the analysis from 3 to a larger number of stages is straightforward. Removing the between-stages independence assumption may create technical difficulties and requires further investigation.

5 Appendix

Proof of Theorem 1. By the LD bound (1.4) we have that for any x ∈ X and ε > 0 it holds that
\[
P\big\{ \hat\varphi_N(x) - \varphi(x) \ge \varepsilon \big\} \le \exp\{-N I_x(\varepsilon)\}, \tag{5.1}
\]
where
\[
I_x(z) := \sup_{t\in\mathbb{R}} \{ zt - \log M_x(t) \} \tag{5.2}
\]
is the rate function of the random variable h(x, ξ) − φ(x). Similarly,
\[
P\big\{ \hat\varphi_N(x) - \varphi(x) \le -\varepsilon \big\} \le \exp\{-N I_x(-\varepsilon)\},
\]
and hence
\[
P\big\{ |\hat\varphi_N(x) - \varphi(x)| \ge \varepsilon \big\} \le \exp\{-N I_x(\varepsilon)\} + \exp\{-N I_x(-\varepsilon)\}. \tag{5.3}
\]
For a constant ν > 0, let x̄₁, ..., x̄_M ∈ X be such that for every x ∈ X there exists x̄_l, l ∈ {1, ..., M}, such that ‖x − x̄_l‖ ≤ ν. Such a set {x̄₁, ..., x̄_M} is called a ν-net in X. We can choose this net in such a way that M ≤ O(1)(D/ν)^n, where D := sup_{x′,x∈X} ‖x′ − x‖ is the diameter of X and O(1) is a generic constant. By (3.3) we have that
\[
|\varphi(x') - \varphi(x)| \le L\|x' - x\| \tag{5.4}
\]
and
\[
|\hat\varphi_N(x') - \hat\varphi_N(x)| \le L\|x' - x\| \tag{5.5}
\]
for any x′, x ∈ X. It follows by (5.3) that
\[
P\Big( \max_{1\le l\le M} |\hat\varphi_N(\bar x_l) - \varphi(\bar x_l)| \ge \varepsilon \Big) = P\Big( \bigcup_{l=1}^{M} \big\{ |\hat\varphi_N(\bar x_l) - \varphi(\bar x_l)| \ge \varepsilon \big\} \Big) \le \sum_{l=1}^{M} P\big( |\hat\varphi_N(\bar x_l) - \varphi(\bar x_l)| \ge \varepsilon \big) \le 2\sum_{l=1}^{M} \exp\big\{ -N\big[ I_{\bar x_l}(\varepsilon) \wedge I_{\bar x_l}(-\varepsilon) \big] \big\}. \tag{5.6}
\]
For an x ∈ X consider l(x) ∈ arg min_{1≤l≤M} ‖x − x̄_l‖. By construction of the ν-net we have that ‖x − x̄_{l(x)}‖ ≤ ν for every x ∈ X. Then
\[
|\hat\varphi_N(x) - \varphi(x)| \le |\hat\varphi_N(x) - \hat\varphi_N(\bar x_{l(x)})| + |\hat\varphi_N(\bar x_{l(x)}) - \varphi(\bar x_{l(x)})| + |\varphi(\bar x_{l(x)}) - \varphi(x)| \le L\nu + |\hat\varphi_N(\bar x_{l(x)}) - \varphi(\bar x_{l(x)})| + L\nu.
\]
Let us now take a ν-net with ν such that Lν = ε/4, i.e., ν := ε/(4L). Then
\[
P\Big\{ \sup_{x\in X} |\hat\varphi_N(x) - \varphi(x)| \ge \varepsilon \Big\} \le P\Big\{ \max_{1\le l\le M} |\hat\varphi_N(\bar x_l) - \varphi(\bar x_l)| \ge \varepsilon/2 \Big\},
\]
which together with (5.6) implies that
\[
P\Big\{ \sup_{x\in X} |\hat\varphi_N(x) - \varphi(x)| \ge \varepsilon \Big\} \le 2\sum_{l=1}^{M} \exp\big\{ -N\big[ I_{\bar x_l}(\varepsilon/2) \wedge I_{\bar x_l}(-\varepsilon/2) \big] \big\}. \tag{5.7}
\]
Moreover, because of condition (i) we have that log M_x(t) ≤ σ²t²/2, and hence
\[
I_x(z) \ge \frac{z^2}{2\sigma^2}, \quad \forall z \in \mathbb{R}. \tag{5.8}
\]
It follows from (5.7) and (5.8) that
\[
P\Big\{ \sup_{x\in X} |\hat\varphi_N(x) - \varphi(x)| \ge \varepsilon \Big\} \le 2M \exp\Big\{ -\frac{N\varepsilon^2}{16\sigma^2} \Big\}. \tag{5.9}
\]
Finally, since M ≤ O(1)(D/ν)^n = O(1)(DL/ε)^n, we obtain that (5.9) implies (3.4), and hence the proof is complete.

References

[1] J. Blomvall and A. Shapiro, Solving multistage asset investment problems by Monte Carlo based optimization, E-print available at: http://www.optimization-online.org, 2004.

[2] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications, Springer-Verlag, New York, NY, 1998.

[3] A. J. Kleywegt, A. Shapiro, and T. Homem-De-Mello, The sample average approximation method for stochastic discrete optimization, SIAM Journal on Optimization, 12 (2001), 479–502.

[4] J. Linderoth, A. Shapiro, and S. Wright, The empirical behavior of sampling methods for stochastic programming, Annals of Operations Research, to appear.

[5] W. K. Mak, D. P. Morton, and R. K. Wood, Monte Carlo bounding techniques for determining solution quality in stochastic programs, Operations Research Letters, 24 (1999), 47–56.

[6] A. Shapiro, Inference of statistical bounds for multistage stochastic programming problems, Mathematical Methods of Operations Research, 58 (2003), 57–68.

[7] A. Shapiro, Monte Carlo sampling methods, in: A. Ruszczyński and A. Shapiro (editors), Stochastic Programming, volume 10 of Handbooks in Operations Research and Management Science, North-Holland, 2003.

[8] A. Shapiro and A. Nemirovski, On complexity of stochastic programming problems, E-print available at: http://www.optimization-online.org, 2004.

[9] B. Verweij, S. Ahmed, A. J. Kleywegt, G. Nemhauser, and A. Shapiro, The sample average approximation method applied to stochastic routing problems: a computational study, Computational Optimization and Applications, 24 (2003), 289–333.