Stochastic Dual Dynamic Programming


Stochastic Dual Dynamic Programming
Operations Research
Anthony Papavasiliou

Contents

[§10.4 of BL], [Pereira, 1991]

1 Recalling the Nested L-Shaped Decomposition
2 Drawbacks of Nested Decomposition and How to Overcome Them
3 Stochastic Dual Dynamic Programming (SDDP)
4 Example

Table of Contents

1 Recalling the Nested L-Shaped Decomposition
2 Drawbacks of Nested Decomposition and How to Overcome Them
3 Stochastic Dual Dynamic Programming (SDDP)
4 Example

The Nested L-Shaped Decomposition Subproblem

For each stage t = 1, ..., H − 1 and scenario k = 1, ..., K_t:

NLDS(t, k): min (c_k^t)^T x_k^t + θ_k^t
s.t. W^t x_k^t = h_k^t − T_k^{t−1} x_{a(k)}^{t−1}    (π_k^t)
     D_{k,j}^t x_k^t ≥ d_{k,j}^t, j = 1, ..., r_k^t    (ρ_k^t)    (1)
     E_{k,j}^t x_k^t + θ_k^t ≥ e_{k,j}^t, j = 1, ..., s_k^t    (σ_k^t)    (2)
     x_k^t ≥ 0

K_t: number of distinct scenarios at stage t
a(k): ancestor of scenario k at stage t − 1
x_{a(k)}^{t−1}: current solution from a(k)
Constraints (1): feasibility cuts
Constraints (2): optimality cuts
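As a concreteness check, the subproblem above is an ordinary LP in (x_k^t, θ_k^t). Below is a minimal Python/scipy sketch of one NLDS(t, k); the wrapper, its matrices, and the cut lists are illustrative placeholders, not data from these slides.

import numpy as np
from scipy.optimize import linprog

def solve_nlds(c, W, h, T, x_prev, feas_cuts, opt_cuts):
    # min c^T x + theta  s.t.  W x = h - T x_prev,
    # feasibility cuts D x >= d, optimality cuts E x + theta >= e, x >= 0.
    n = len(c)
    obj = np.append(c, 1.0)                        # decision vector is (x, theta)
    A_eq = np.hstack([W, np.zeros((W.shape[0], 1))])
    b_eq = h - T @ x_prev
    A_ub, b_ub = [], []
    for D, d in feas_cuts:                         # D x >= d  ->  -D x <= -d
        A_ub.append(np.append(-D, 0.0)); b_ub.append(-d)
    for E, e in opt_cuts:                          # E x + theta >= e
        A_ub.append(np.append(-E, -1.0)); b_ub.append(-e)
    bounds = [(0, None)] * n + [(-1e9, None)]      # crude bound keeps theta finite
    return linprog(obj, A_ub=np.array(A_ub) if A_ub else None,
                   b_ub=np.array(b_ub) if b_ub else None,
                   A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")

With the HiGHS-based methods, the duals of the equality rows are available in res.eqlin.marginals, which is what the cut formulas below consume.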

Nested L-Shaped Method

Building block: NLDS(t, k), the subproblem at stage t, scenario k
Repeated application of the L-shaped method
Variants depending on how we traverse the scenario tree

[Figure: NLDS(t, k) receives a trial solution from its ancestor (t − 1, a(k)), sends cuts (π_k^t, ρ_k^t, σ_k^t) back to it, and passes its own trial solution x_k^t forward to its descendants (t + 1, D_{t+1}(k))]

a(k): ancestor of scenario k
D_{t+1}(k): descendants of scenario k in period t + 1

Example

Node: (t = 1, k = 1)
Direction: forward
Output: x_1^1

Example

Nodes: (t = 2, k), k ∈ {1, 2}
Direction: forward
Output: x_k^2, k ∈ {1, 2}

Example

Nodes: (t = 3, k), k ∈ {1, 2, 3, 4}
Direction: backward
Output: (π_k^3, ρ_k^3, σ_k^3), k ∈ {1, 2, 3, 4}

Example

Nodes: (t = 2, k), k ∈ {1, 2}
Direction: backward
Output: (π_k^2, ρ_k^2, σ_k^2), k ∈ {1, 2}

Feasibility Cuts

If NLDS(t, k) is infeasible, the solver returns π_k^t, ρ_k^t ≥ 0 such that

(π_k^t)^T (h_k^t − T_k^{t−1} x_{a(k)}^{t−1}) + (ρ_k^t)^T d_k^t > 0
(π_k^t)^T W^t + (ρ_k^t)^T D_k^t ≤ 0

The following is then a valid feasibility cut for NLDS(t − 1, a(k)):

D_{a(k)}^{t−1} x ≥ d_{a(k)}^{t−1}

where

D_{a(k)}^{t−1} = (π_k^t)^T T_k^{t−1}
d_{a(k)}^{t−1} = (π_k^t)^T h_k^t + (ρ_k^t)^T d_k^t
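Once the solver returns such a dual ray (π_k^t, ρ_k^t), assembling the cut is two inner products. A small numpy sketch, with all arrays as placeholders for the quantities named above:

import numpy as np

def feasibility_cut(pi, rho, T, h, d):
    # Returns (D_cut, d_cut) with D_cut x >= d_cut valid for NLDS(t-1, a(k)).
    D_cut = pi @ T               # (pi_k^t)^T T_k^{t-1}
    d_cut = pi @ h + rho @ d     # (pi_k^t)^T h_k^t + (rho_k^t)^T d_k^t
    return D_cut, d_cut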

Optimality Cuts

Solve NLDS(t, k) for all k; then for each j = 1, ..., K_{t−1} compute

E_j^{t−1} = Σ_{k ∈ D_t(j)} (p_k^t / p_j^{t−1}) (π_k^t)^T T_k^{t−1}

e_j^{t−1} = Σ_{k ∈ D_t(j)} (p_k^t / p_j^{t−1}) [(π_k^t)^T h_k^t + Σ_{i=1}^{r_k^t} ρ_{k,i}^t d_{k,i}^t + Σ_{i=1}^{s_k^t} σ_{k,i}^t e_{k,i}^t]

D_t(j): period-t descendants of scenario j at period t − 1
Note: p_k^t / p_j^{t−1} = P(k, t | j, t − 1)
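A matching numpy sketch of the optimality-cut coefficients; the dict layout for the descendant data is an assumption made for this example, not part of the lecture:

import numpy as np

def optimality_cut(descendants, p_parent):
    # descendants: one dict per k in D_t(j) with keys p, pi, T, h, rho, d,
    # sigma, e holding that node's probability, duals, and problem data.
    E_j, e_j = 0.0, 0.0
    for s in descendants:
        w = s["p"] / p_parent                      # P(k, t | j, t-1)
        E_j = E_j + w * (s["pi"] @ s["T"])
        e_j = e_j + w * (s["pi"] @ s["h"] + s["rho"] @ s["d"]
                         + s["sigma"] @ s["e"])
    return E_j, e_j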

Table of Contents

1 Recalling the Nested L-Shaped Decomposition
2 Drawbacks of Nested Decomposition and How to Overcome Them
3 Stochastic Dual Dynamic Programming (SDDP)
4 Example

Recombining Scenario Tree

When can we recombine nodes? When can we assign the same value function V^{t+1}(x) to each node k of stage t?

Nested Decomposition Is Non-Scalable

Assume H time steps, M_t discrete outcomes in each stage, and no infeasibility cuts.

[Figure: scenario tree with M_1 = 1, M_2 = 2, M_3 = 4]

Forward pass: M_1 + M_1 M_2 + ... = Σ_{t=1}^H Π_{j=1}^t M_j
Backward pass: Σ_{t=2}^{H−1} Π_{j=1}^t M_j
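A quick way to see the blow-up is to count the LP solves of one enumerated forward pass; a short Python sketch, where the second stage-outcome profile is made up for illustration:

def forward_lp_count(M):
    # M[t] = number of outcomes at stage t; counts sum_t prod_{j<=t} M_j.
    total, nodes = 0, 1
    for m in M:
        nodes *= m
        total += nodes
    return total

print(forward_lp_count([1, 2, 4]))        # the tree above: 11 nodes
print(forward_lp_count([1] + [3] * 11))   # 12 stages, 3 outcomes each: 265720

With only 3 outcomes per stage, a 12-stage tree already needs roughly 265,000 LP solves per pass.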

Was Nested Decomposition Any Good?

The alternative to nested decomposition is the extended form.
The extended form will not even load in memory.
Nested decomposition will load in memory, but will not terminate (for large problems).
Nested decomposition lays the foundations for SDDP.

Enumerating Versus Simulating

[Figure: two-stage tree with outcomes {1, 2} at the first branching and {1, 2, 3, 4} at the second]

Enumeration: {(1, 1), (1, 2), (1, 3), (1, 4), (2, 1), (2, 2), (2, 3), (2, 4)}
Simulation (with 3 samples): {(1, 3), (2, 1), (1, 4)}

Making Nested Decomposition Scalable

Solution for the forward pass: simulate instead of enumerating. This yields a probabilistic upper bound and termination criterion.

Solution for the backward pass: share cuts among nodes of the same time period. This requires an assumption of serial independence.

Serial Independence

Serial independence: the probability of realization ξ_i^t is the same from all possible (t − 1)-stage scenarios:

P(ξ_k^3 = c_k^3 | ξ_j^2 = c_j^2) = p_k, for all j ∈ {1, ..., M_2}, k ∈ {1, ..., M_3}

[Figure: two stage-2 nodes ω_1 and ω_2 with identical subtrees]

The problem is identical from t = 2 onward whether we observe ω_1 or ω_2.
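In code, serial independence of one stage amounts to every parent node carrying the same row of transition probabilities (assuming, in addition, that the child realizations coincide across parents). A small numpy sketch with made-up matrices:

import numpy as np

def stage_is_serially_independent(P):
    # P[parent, child] = transition probability into each stage-t outcome.
    return bool(np.allclose(P, P[0]))    # every parent has an identical row

print(stage_is_serially_independent(np.array([[0.5, 0.5], [0.5, 0.5]])))  # True
print(stage_is_serially_independent(np.array([[0.6, 0.4], [0.3, 0.7]])))  # False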

Example of Serial Independence (I)

Values in circles: realizations ξ_k^t
Values on edges: transition probabilities

[Figure: scenario tree with edge-probability/realization pairs (0.5, 4), (0.5, 1), (0.5, 3), (0.5, 2), (0.5, 2), (0.5, 1)]

Is this tree serially independent?

Example of Serial Independence (II)

[Figure: scenario tree with edge-probability/realization pairs (0.5, 2), (0.5, 1), (0.5, 1), (0.5, 2), (0.4, 2), (0.6, 1)]

Is this tree serially independent?

Example of Serial Independence (III)

[Figure: scenario tree with edge-probability/realization pairs (0.6, 1), (0.3, 1), (0.4, 2), (0.7, 2), (0.4, 2), (0.6, 1)]

Is this tree serially independent?

Implications for Forward Pass

At each forward pass we solve H − 1 NLDS problems
For K Monte Carlo simulations, we solve 1 + K (H − 1) linear programs

Implications for Backward Pass

Serial independence implies the same value function for all nodes of stage t ⇒ cut sharing
For a given trial sequence x_k^t, we solve Σ_{t=2}^H M_t linear programs; for K trial sequences we solve K Σ_{t=2}^H M_t linear programs

Serial Independence is Helpful, Not Necessary

We can use dual multipliers from stage t + 1 for cuts in stage t even without serial independence
However, each node in stage t then has a different value function:
more memory
more optimality cuts needed, because we are approximating more value functions
With serial independence, we can get rid of the scenario tree and work with a continuous distribution of ξ^t

Table of Contents

1 Recalling the Nested L-Shaped Decomposition
2 Drawbacks of Nested Decomposition and How to Overcome Them
3 Stochastic Dual Dynamic Programming (SDDP)
4 Example

SDDP Forward Pass

Solve NLDS(1, 1); let x_1^1 be the optimal solution. Initialize x̂_i^1 = x_1^1 for i = 1, ..., K.

Repeat for t = 2, ..., H and i = 1, ..., K:
Sample a vector h_i^t from the set {h_k^t, k = 1, ..., M_t}
Solve NLDS(t, i) with trial decision x̂_i^{t−1}
Store the optimal solution as x̂_i^t
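A compact Python sketch of this forward pass, assuming serial independence and a hypothetical helper solve_nlds(t, h, x_prev) that solves NLDS at stage t with right-hand side h and trial decision x_prev:

import numpy as np

rng = np.random.default_rng(0)

def forward_pass(solve_nlds, h_outcomes, p_outcomes, H, K):
    # x_trial[i][t-1] holds hat{x}_i^t; index 0 is the stage-1 solution.
    x1 = solve_nlds(1, None, None)                 # NLDS(1, 1)
    x_trial = [[x1] for _ in range(K)]
    for i in range(K):
        for t in range(2, H + 1):
            idx = rng.choice(len(h_outcomes[t]), p=p_outcomes[t])
            x = solve_nlds(t, h_outcomes[t][idx], x_trial[i][-1])
            x_trial[i].append(x)                   # store hat{x}_i^t
    return x_trial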

SDDP Backward Pass

Repeat for t = H, H − 1, ..., 2:
Repeat for i = 1, ..., K:
Repeat for k = 1, ..., M_t:
Solve NLDS(t, k) with trial decision x̂_i^{t−1}
Compute
E^{t−1} = Σ_{k=1}^{M_t} p_k^t (π_{k,i}^t)^T T_k^{t−1}
e^{t−1} = Σ_{k=1}^{M_t} p_k^t ((π_{k,i}^t)^T h_k^t + (σ_{k,i}^t)^T e_k^t)
Add the optimality cut E^{t−1} x + θ ≥ e^{t−1} to every NLDS(t − 1, k), k = 1, ..., M_{t−1}
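And a matching sketch of the backward pass. Here solve_nlds_duals is a hypothetical helper returning the dual vector π and the scalar (σ_{k,i}^t)^T e_k^t for one node, T[t] holds the stage-t technology matrix (node-independent, consistent with cut sharing), and cuts maps each stage to its list of (E, e) pairs:

def backward_pass(solve_nlds_duals, x_trial, h_outcomes, p, T, H, K, cuts):
    for t in range(H, 1, -1):
        for i in range(K):
            E, e = 0.0, 0.0
            for k, h in enumerate(h_outcomes[t]):
                # duals of NLDS(t, k) at trial decision hat{x}_i^{t-1}
                pi, sigma_e = solve_nlds_duals(t, h, x_trial[i][t - 2])
                E = E + p[t][k] * (pi @ T[t - 1])
                e = e + p[t][k] * (pi @ h + sigma_e)
            cuts[t - 1].append((E, e))    # shared by every NLDS(t-1, k)
    return cuts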

Central Limit Theorem

Suppose {X_1, X_2, ...} is a sequence of independent, identically distributed random variables with E[X_i] = µ and Var[X_i] = σ² < ∞. Then

√n ((1/n) Σ_{i=1}^n X_i − µ) →_d N(0, σ²)

Probabilistic Upper Bound

Suppose we draw a sample k of (ξ_k^t), t = 1, ..., H, and solve NLDS(t, k) for t = 1, ..., H
This gives us a vector x_k^t, t = 1, ..., H
We can compute the cost of this vector: z_k = Σ_{t=1}^H (c_k^t)^T x_k^t
If we repeat this K times, we get a sample of independent, identically distributed costs z_k, k = 1, ..., K
By the Central Limit Theorem, z̄ = (1/K) Σ_{k=1}^K z_k converges to a Gaussian, with the standard deviation of the mean estimated by σ = √( (1/K²) Σ_{k=1}^K (z_k − z̄)² )
Each (x_k^t, t = 1, ..., H) is feasible but not necessarily optimal, so z̄ is an estimate of an upper bound

Bounds and Termination Criterion

After solving NLDS(1, 1) in a forward pass, we can compute a lower bound z_LB as the objective function value of NLDS(1, 1)

After completing a forward pass, we can compute

z_k = Σ_{t=1}^H (c_k^t)^T x̂_k^t
z̄ = (1/K) Σ_{k=1}^K z_k
σ = √( (1/K²) Σ_{k=1}^K (z_k − z̄)² )

Terminate if z_LB ∈ (z̄ − 2σ, z̄ + 2σ), which is the 95.4% confidence interval of z̄
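The bound bookkeeping and the termination test fit in a few lines of Python; the numbers passed in at the bottom are made up:

import numpy as np

def should_terminate(z_lb, costs):
    # costs: array of the K simulated path costs z_k from the forward pass.
    K = len(costs)
    z_bar = costs.mean()
    sigma = np.sqrt(((costs - z_bar) ** 2).sum() / K**2)   # std dev of the mean
    return z_bar - 2 * sigma < z_lb < z_bar + 2 * sigma, z_bar, sigma

print(should_terminate(5.9, np.array([6.2, 5.8, 6.5, 6.0])))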

Graphical Illustration of Termination Criterion

[Figure: objective value per iteration, showing the upper-bound estimate z̄ with a ±2σ band and the lower bound z_LB]

Size of Monte Carlo Sample

How can we ensure a 1% optimality gap with 95.4% confidence? Choose K such that 2σ ≤ 0.01 z̄

The mean and variance depend (asymptotically) on the statistical properties of the process, not on K:

z̄ = (1/K) Σ_{k=1}^K z_k
s = √( (1/K) Σ_{k=1}^K (z_k − z̄)² )
σ = s / √K

Set 2s / √K ≤ 0.01 z̄, i.e. K ≥ (2s / (0.01 z̄))²
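The resulting sample-size rule as a one-function Python sketch; the pilot estimates fed to it below are the ones from the example at the end of this deck:

import math

def required_samples(z_bar, s, rel_gap=0.01, width=2.0):
    # K >= (width * s / (rel_gap * z_bar))^2
    return math.ceil((width * s / (rel_gap * z_bar)) ** 2)

print(required_samples(6.17, 2.02))   # 4288, matching the ~4287 on the last slide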

Full SDDP Algorithm

Initialize: z̄ = ∞, σ = 0
Forward pass; store z_LB and z̄. If z_LB ∈ (z̄ − 2σ, z̄ + 2σ), terminate; else go to the backward pass
Backward pass
Go to the forward pass

Table of Contents

1 Recalling the Nested L-Shaped Decomposition
2 Drawbacks of Nested Decomposition and How to Overcome Them
3 Stochastic Dual Dynamic Programming (SDDP)
4 Example

Example

Consider the following problem:
Produce air conditioners over 3 months
Regular production: 200 units/month at 100 $/unit
Overtime production: 300 $/unit
Known demand of 100 units in period 1
Equally likely demand of 100 or 300 units in periods 2 and 3
Storage cost: 50 $/unit
All demand must be met

Notation

x_k^t: regular production
y_k^t: number of stored units
w_k^t: overtime production
d_k^t: demand

What does the scenario tree look like?

Extended Form

With quantities in hundreds of units (so regular production is capped at 2 and costs are scaled accordingly):

min x^1 + 3w^1 + 0.5y^1 + Σ_{k=1}^2 p_k^2 (x_k^2 + 3w_k^2 + 0.5y_k^2) + Σ_{k=1}^4 p_k^3 (x_k^3 + 3w_k^3)
s.t. x^1 ≤ 2
     x^1 + w^1 − y^1 = 1
     y^1 + x_k^2 + w_k^2 − y_k^2 = d_k^2, k = 1, 2
     x_k^2 ≤ 2, k = 1, 2
     y_{a(k)}^2 + x_k^3 + w_k^3 − y_k^3 = d_k^3, k = 1, ..., 4
     x_k^3 ≤ 2, k = 1, ..., 4
     x_k^t, w_k^t, y_k^t ≥ 0, k = 1, ..., K_t, t = 1, 2, 3
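Since the extended form is a single LP, it can be solved directly. A Python/scipy sketch, assuming quantities in hundreds of units and dropping the final-stage storage variables, which are zero at any optimum:

import numpy as np
from scipy.optimize import linprog

# variable order: (x1, w1, y1, x2_1, w2_1, y2_1, x2_2, w2_2, y2_2,
#                  x3_1, w3_1, x3_2, w3_2, x3_3, w3_3, x3_4, w3_4)
c = [1, 3, .5] + [.5, 1.5, .25] * 2 + [.25, .75] * 4
A_eq = np.zeros((7, 17))
b_eq = [1, 1, 3, 1, 3, 1, 3]
A_eq[0, [0, 1, 2]] = [1, 1, -1]          # x1 + w1 - y1 = 1
A_eq[1, [2, 3, 4, 5]] = [1, 1, 1, -1]    # y1 + x2_1 + w2_1 - y2_1 = 1
A_eq[2, [2, 6, 7, 8]] = [1, 1, 1, -1]    # y1 + x2_2 + w2_2 - y2_2 = 3
A_eq[3, [5, 9, 10]] = [1, 1, 1]          # y2_1 + x3_1 + w3_1 = 1
A_eq[4, [5, 11, 12]] = [1, 1, 1]         # y2_1 + x3_2 + w3_2 = 3
A_eq[5, [8, 13, 14]] = [1, 1, 1]         # y2_2 + x3_3 + w3_3 = 1
A_eq[6, [8, 15, 16]] = [1, 1, 1]         # y2_2 + x3_4 + w3_4 = 3
bounds = [(0, 2) if i in (0, 3, 6, 9, 11, 13, 15) else (0, None)
          for i in range(17)]
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
print(res.fun)   # expected optimal cost: 6.25
print(res.x)     # should match the stage-wise solution on the next slide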

Optimal solution:
Stage 1: x^1 = 2, y^1 = 1
Stage 2, scenario 1: x_1^2 = 1, y_1^2 = 1
Stage 2, scenario 2: x_2^2 = 2, y_2^2 = 0
Stage 3, scenario 1: x_1^3 = 0
Stage 3, scenario 2: x_2^3 = 2
Stage 3, scenario 3: x_3^3 = 1
Stage 3, scenario 4: x_4^3 = 2, w_4^3 = 1

What is the cost of each path?

SDDP Upper Bound Computation

param CostRecord{1..MCCount, 1..IterationCount};

let {m in 1..MCCount, i in 1..IterationCount} CostRecord[m, i] :=
    sum{j in Decisions, t in 1..H} c[j]*xtrialrecord[j, t, m, i];

let {m in 1..MCCount} CostSamples[m] := CostRecord[m, IterationCount];

let CostAverage := sum{m in 1..MCCount} CostSamples[m] / MCCount;

let CostStDev := sqrt(sum{m in 1..MCCount}
    (CostSamples[m] - CostAverage)^2 / MCCount^2);

Thinking About the Data

CostRecord{1..MCCount, 1..IterationCount}

What is the distribution of each column?
How does the (k, i) entry depend on the (k + a, i) entry?
Which column is more likely to have a lower average?
Which data has a Gaussian distribution?

Distribution of Last Column

[Figure: histogram of the last column of CostRecord]

z̄ = 6.17, s = 2.02
Not a Gaussian distribution

Moving Average for 5 Iterations

Plot: (N, (1/N) Σ_{n=1}^N CostSamples_n), with MCCount = 100, IterationCount = 5

CostStDev = 0.2007: sample standard deviation of the last column of CostRecord
Note: the average cost decreases as the iteration count increases

How Many Monte Carlo Samples?

K ≥ (2s / (0.01 z̄))² = (2 · 2.02 / (0.01 · 6.17))² ≈ 4287