Robust Dual Dynamic Programming


Angelos Georghiou, Angelos Tsoukalas, Wolfram Wiesemann
American University of Beirut, Olayan School of Business
31 May 2017

Inspired by SDDP

Stochastic optimization:
- optimizes the expected value: $\min_x \mathbb{E}_P[f(x, \xi)]$
- needs to know the distribution
- addressed by Nested Benders decomposition and SDDP

Robust optimization:
- optimizes for the worst case: $\min_x \max_{\xi \in \Xi} f(x, \xi)$
- works with uncertainty sets
- addressed by RDDP

Multistage Robust Optimization

$$\begin{aligned}
\text{minimize} \quad & \max_{\xi \in \Xi} \sum_{t=1}^{T} q_t^\top x_t(\xi^t) \\
\text{subject to} \quad & T_t x_{t-1}(\xi^{t-1}) + W_t x_t(\xi^t) \ge H_t \xi_t \quad \forall \xi \in \Xi, \ \forall t \\
& x_t(\xi^t) \in \mathbb{R}^{n_t}, \quad \xi_t \in \Xi_t, \quad \xi^t = (\xi_1, \dots, \xi_t)
\end{aligned}$$

(The stochastic counterpart would instead minimize the expectation $\mathbb{E}\big[\sum_{t=1}^{T} q_t^\top x_t(\xi^t)\big]$.)

- Optimize over decision policies $x_t(\cdot)$.
- $\Xi_t$ polyhedral uncertainty sets.
- Constraints entangle consecutive stages.
- Infinite number of variables and constraints.

Assumptions

Relatively complete recourse:
- any feasible partial solution $x_1, \dots, x_t$ can be extended to a complete solution $x_1, \dots, x_T$
- can otherwise be addressed using feasibility cuts

Right-hand-side uncertainty only:
- $T_t x_{t-1}(\xi^{t-1}) + W_t x_t(\xi^t) \ge H_t \xi_t$
- ensures finite convergence
- makes the problem easier
- makes the relationship with Nested Benders/SDDP easier to see

Both assumptions can be lifted.

Nested Formulation

The multistage problem can be expressed through a nested formulation:

$$\min_{x_1 \in X_1} q_1^\top x_1 + \Big[ \max_{\xi_2 \in \Xi_2} \min_{x_2 \in X_2(x_1, \xi_2)} q_2^\top x_2 + \dots + \max_{\xi_T \in \Xi_T} \min_{x_T \in X_T(x_{T-1}, \xi_T)} q_T^\top x_T \Big]$$

First-stage problem:

$$\min_{x_1 \in \mathbb{R}^{n_1}} \ q_1^\top x_1 + Q_2(x_1) \quad \text{s.t.} \quad W_1 x_1 \ge h_1$$

$t$-th stage problem:

$$Q_t(x_{t-1}) = \max_{\xi_t \in \Xi_t} \min_{x_t \in \mathbb{R}^{n_t}} \ q_t^\top x_t + Q_{t+1}(x_t) \quad \text{s.t.} \quad T_t x_{t-1} + W_t x_t \ge H_t \xi_t$$

(with $Q_{T+1} \equiv 0$)
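
The nested recursion can be evaluated by brute force on a tiny made-up instance: scalar decisions restricted to a grid, a single hypothetical coupling constraint $x_t \ge \xi_t - x_{t-1}$, and enumeration over the extreme scenarios. This is purely illustrative; all data below is invented.

```python
# Hypothetical toy instance: evaluate the worst-case cost-to-go
# Q_t(x_{t-1}) = max over extreme scenarios of the cheapest feasible stage
# decision plus the future cost, by brute force over a decision grid.

T = 3
ext_Xi = {2: [-1.0, 1.0], 3: [-1.0, 1.0]}  # extreme scenarios per stage
q = {1: 1.0, 2: 1.0, 3: 1.0}               # stage cost coefficients
X = [k * 0.5 for k in range(-6, 11)]       # candidate decisions in [-3, 5]

def Q(t, x_prev):
    """Worst-case cost-to-go; coupling constraint x_t >= xi_t - x_prev."""
    if t > T:
        return 0.0
    return max(
        min(q[t] * x + Q(t + 1, x) for x in X if x >= xi - x_prev)
        for xi in ext_Xi[t]
    )

# first-stage problem: min_x q_1 x + Q_2(x)
first_stage = min(q[1] * x + Q(2, x) for x in X)
print(first_stage)  # -> -2.0 on this toy data
```

The exponential blow-up is already visible: each stage multiplies the work by (scenarios × grid points), which is exactly what the decomposition schemes below avoid.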

Nested Formulation: Key Observations ($Q_2, Q_3, \dots, Q_T, Q_{T+1}$)

- The optimal value of the inner problem is convex in $\xi_t$, so we can replace $\Xi_t$ with its set of extreme points $\mathrm{ext}\,\Xi_t$:

$$Q_t(x_{t-1}) = \max_{\xi_t \in \mathrm{ext}\,\Xi_t} \min_{x_t \in \mathbb{R}^{n_t}} \ q_t^\top x_t + Q_{t+1}(x_t) \quad \text{s.t.} \quad T_t x_{t-1} + W_t x_t \ge H_t \xi_t$$

- The problem decomposes.
- If only we knew the value functions...
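
The replacement of $\Xi_t$ by $\mathrm{ext}\,\Xi_t$ rests on the standard fact that a convex function attains its maximum over a polytope at an extreme point. A minimal numerical illustration, with a made-up convex $f$ and a box uncertainty set:

```python
# Illustration (hypothetical f): the maximum of a convex function over a
# box is attained at a vertex, so max over Xi equals max over ext Xi.
from itertools import product

def f(xi):
    # an arbitrary convex function of a 2-dimensional scenario xi
    return (xi[0] - 0.3) ** 2 + abs(xi[1] + 0.7)

box = [(-1.0, 1.0), (-1.0, 1.0)]   # Xi = [-1, 1]^2
vertices = list(product(*box))     # ext Xi: the 4 corners

# dense grid over the box (interior + boundary), standing in for Xi itself
grid = [(a / 10.0, b / 10.0) for a in range(-10, 11) for b in range(-10, 11)]

assert max(f(v) for v in vertices) == max(f(p) for p in grid)
```

This is what makes the worst-case stage problem finite: only the (finitely many) vertices of a polyhedral $\Xi_t$ ever need to be examined.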

Nested Benders Decomposition

Forward pass (per node, scenario $\xi_2$ fixed):

$$\min_{x_2 \in \mathbb{R}^{n_2}} \ q_2^\top x_2 + Q_3(x_2) \quad \text{s.t.} \quad T_2 x_1 + W_2 x_2 \ge H_2 \xi_2$$

Backward pass:

$$\max_{\xi_2 \in \mathrm{ext}\,\Xi_2} \Big[ \min_{x_2 \in \mathbb{R}^{n_2}} \ q_2^\top x_2 + Q_3(x_2) \quad \text{s.t.} \quad T_2 x_1 + W_2 x_2 \ge H_2 \xi_2 \Big]$$

- Maintain one (outer approximation of a) value function per node.
- Traverse the scenario tree forwards and backwards.
- Forward pass: at every node, solve and decide where to refine; move $x_t$ forward. We refine at all nodes, i.e., for all scenarios.
- Backward pass: introduce Benders cuts to refine the outer approximations.
- But cuts are valid for all nodes of a stage.

Towards SDDP: Cut Sharing

- Maintain one approximation per stage (cuts are shared across all nodes of the stage).
- But which scenario to propagate forwards? Where to refine the approximation?
- Exponential number of end-to-end choices.

SDDP solution (for stochastic programming): pick at random!
- Small number of refinements.
- Good performance in practice.
- No deterministic upper bound/termination criterion.
- Stochastic convergence.

Robust Dual Dynamic Programming

- Not all scenarios are important: pick worst-case scenarios.
- Maintain both inner and outer approximations of the value functions.
- In the forward pass:
  - use inner approximations to choose scenarios;
  - use outer approximations to choose decisions (points of refinement).
- In the backward pass, refine both inner and outer approximations.
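
The two approximations can be pictured in one dimension: the outer approximation is a pointwise maximum of supporting cuts (a lower bound), while the inner approximation interpolates between recorded points on the graph of $Q$ (an upper bound, by convexity). A hedged toy sketch with $Q(x) = x^2$, my own illustration rather than the paper's implementation:

```python
# Toy 1-D sandwich: outer approximation = max of supporting cuts;
# inner approximation = piecewise-linear interpolation through recorded
# points (x, Q(x)). Both bracket the true convex value function Q.

def Q(x):                      # hypothetical convex stage value function
    return x * x

cuts = []                      # (slope, intercept) of supporting lines
points = []                    # recorded (x, Q(x)) pairs on the graph of Q

def refine(x):
    cuts.append((2 * x, -x * x))   # tangent of x^2 at x
    points.append((x, Q(x)))

def outer(x):
    return max(a * x + b for a, b in cuts)

def inner(x):
    # convex combination of the two recorded points bracketing x
    pts = sorted(points)
    lo = max(p for p in pts if p[0] <= x)
    hi = min(p for p in pts if p[0] >= x)
    if lo == hi:
        return lo[1]
    w = (x - lo[0]) / (hi[0] - lo[0])
    return (1 - w) * lo[1] + w * hi[1]

for x in (-2.0, 0.0, 2.0):
    refine(x)

# the sandwich outer(x) <= Q(x) <= inner(x) holds between refined points
for x in (-2.0, -1.0, 0.0, 0.5, 2.0):
    assert outer(x) <= Q(x) + 1e-9
    assert Q(x) <= inner(x) + 1e-9
```

The gap between the two approximations gives a deterministic optimality gap at every iteration, which is exactly the termination criterion that plain SDDP lacks.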

Where to Refine? Forward Pass.

Choose the scenario by maximizing a convex function (using the inner approximation $\overline{Q}_{t+1}$):

$$\xi_t^f = \arg\max_{\xi_t \in \Xi_t} \min_{x_t \in \mathbb{R}^{n_t}} \ q_t^\top x_t + \overline{Q}_{t+1}(x_t) \quad \text{s.t.} \quad T_t x_{t-1}^f + W_t x_t \ge H_t \xi_t$$

Choose the decision by minimizing a convex function (using the outer approximation $\underline{Q}_{t+1}$):

$$x_t^f = \arg\min_{x_t \in \mathbb{R}^{n_t}} \ q_t^\top x_t + \underline{Q}_{t+1}(x_t) \quad \text{s.t.} \quad T_t x_{t-1}^f + W_t x_t \ge H_t \xi_t^f$$

How to Refine? Backward Pass.

$$\xi_t^b = \arg\max_{\xi_t \in \Xi_t} \min_{x_t \in \mathbb{R}^{n_t}} \ q_t^\top x_t + Q_{t+1}(x_t) \quad \text{s.t.} \quad T_t x_{t-1}^f + W_t x_t \ge H_t \xi_t$$

with corresponding inner solution $x_t^b$.

- Refine the inner approximation: add the point $(x_{t-1}^f,\ q_t^\top x_t^b + Q_{t+1}(x_t^b))$ to the description of $\overline{Q}_t$.
- Refine the outer approximation $\underline{Q}_t$ with a hyperplane at $x_{t-1}^f$, obtained by solving (the dual of)

$$Q_t(x_{t-1}^f) = \min_{x_t \in \mathbb{R}^{n_t}} \ q_t^\top x_t + Q_{t+1}(x_t) \quad \text{s.t.} \quad T_t x_{t-1}^f + W_t x_t \ge H_t \xi_t^b$$

- Subgradients of perturbation functions $\leftrightarrow$ Lagrange multipliers.
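
For intuition, here is the cut construction in the smallest possible case: a scalar second stage $\min\{2y : y \ge \xi - x,\ y \ge 0\}$, where the optimal dual multiplier of the binding constraint gives the cut's slope. This hand-solvable stand-in for the LP dual is my own sketch, not the paper's code:

```python
# Hedged toy: a Benders cut for the stage value function
# Q(x) = min { 2*y : y >= xi - x, y >= 0 } = 2 * max(xi - x, 0).
# The dual multiplier of the binding constraint is the cut's slope.

xi = 3.0          # worst-case scenario chosen in the backward pass
x_f = 1.0         # incumbent decision from the forward pass

def Q(x):
    return 2.0 * max(xi - x, 0.0)

# At x_f the constraint y >= xi - x is binding with multiplier lam = 2,
# giving the supporting hyperplane (Benders cut) at x_f:
lam = 2.0
def cut(z):       # Q(z) >= Q(x_f) - lam * (z - x_f) for all z
    return Q(x_f) - lam * (z - x_f)

for z in (-1.0, 0.0, 1.0, 2.5, 4.0):
    assert cut(z) <= Q(z) + 1e-12   # valid everywhere (lower bound)
assert abs(cut(x_f) - Q(x_f)) < 1e-12  # tight at the refinement point
```

In the general problem the same information comes from the Lagrange multipliers of the constraints $T_t x_{t-1}^f + W_t x_t \ge H_t \xi_t^b$: the multipliers price a perturbation of $x_{t-1}^f$, i.e. they are subgradients of $Q_t$.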

[Worked example, shown as an animation over many slides: a three-stage instance whose plots show the stage objectives, including $2x_1 + Q_2(x_1)$ and $4x_2 + Q_3(x_2)$, together with their inner and outer approximations. Each iteration runs a forward pass (computing $x_1^f$, $x_2^f$, $x_3^f$ and updating the refinement region at each stage, e.g. an interval per stage) followed by a backward pass (adding a point to the inner approximation and a cut to the outer approximation of $Q_3$, then of $Q_2$). Over three such iterations the approximations visibly tighten around the true value functions; the individual frames and their numerical labels are not reproduced here.]

Numerical Results: Inventory Control

Numerical Results: Scalability w.r.t. the Horizon

- horizon $T \in \{10, 50, 100\}$
- 5 products
- 4 random variables per stage ($2^4 = 16$ scenarios)

(Plots: relative distance (%) vs. optimization time (secs), one panel per horizon.)

RDDP scales better than linear decision rules (LDR) w.r.t. the horizon.

Numerical Results: Scalability w.r.t. the Number of Products

- products $\in \{10, 15, 20\}$
- horizon $T = 10$
- 4 random variables per stage ($2^4 = 16$ scenarios)

(Plots: relative distance (%) vs. optimization time (secs), one panel per product count.)

RDDP does not solve the curse of dimensionality, but it can address problems of practical importance.

Numerical Results: Scalability w.r.t. the Number of Random Variables

- random variables per stage $\in \{5, 7, 9\}$, i.e., scenarios per stage $\in \{32, 128, 512\}$
- products $\in \{6, 8, 10\}$
- horizon $T = 10$

(Plots: relative distance (%) vs. optimization time (secs), one panel per setting.)

Current Work: Extension to Stochastic Programming

- Same inner approximation, different algorithm.
- Same deterministic convergence guarantees.
- Preliminary results indicate comparable complexity.

RDDP Summary

- Converges to the optimal solution.
- Implementable strategy at every iteration (upper bound).
- Lower bound available at every iteration.
- Finite convergence for right-hand-side/technology-matrix uncertainty.
- Deterministic asymptotic convergence for recourse-matrix/objective uncertainty.
- 2-stage sub-problems are hard; state-of-the-art multistage problems are small 2-stage problems.
- Outlook: stochastic optimization / distributionally robust optimization.


More information

Stochastic Programming in Gas Storage and Gas Portfolio Management. ÖGOR-Workshop, September 23rd, 2010 Dr. Georg Ostermaier

Stochastic Programming in Gas Storage and Gas Portfolio Management. ÖGOR-Workshop, September 23rd, 2010 Dr. Georg Ostermaier Stochastic Programming in Gas Storage and Gas Portfolio Management ÖGOR-Workshop, September 23rd, 2010 Dr. Georg Ostermaier Agenda Optimization tasks in gas storage and gas portfolio management Scenario

More information

An introduction on game theory for wireless networking [1]

An introduction on game theory for wireless networking [1] An introduction on game theory for wireless networking [1] Ning Zhang 14 May, 2012 [1] Game Theory in Wireless Networks: A Tutorial 1 Roadmap 1 Introduction 2 Static games 3 Extensive-form games 4 Summary

More information

Bounding Optimal Expected Revenues for Assortment Optimization under Mixtures of Multinomial Logits

Bounding Optimal Expected Revenues for Assortment Optimization under Mixtures of Multinomial Logits Bounding Optimal Expected Revenues for Assortment Optimization under Mixtures of Multinomial Logits Jacob Feldman School of Operations Research and Information Engineering, Cornell University, Ithaca,

More information

CSCI 1951-G Optimization Methods in Finance Part 07: Portfolio Optimization

CSCI 1951-G Optimization Methods in Finance Part 07: Portfolio Optimization CSCI 1951-G Optimization Methods in Finance Part 07: Portfolio Optimization March 9 16, 2018 1 / 19 The portfolio optimization problem How to best allocate our money to n risky assets S 1,..., S n with

More information

Stochastic Approximation Algorithms and Applications

Stochastic Approximation Algorithms and Applications Harold J. Kushner G. George Yin Stochastic Approximation Algorithms and Applications With 24 Figures Springer Contents Preface and Introduction xiii 1 Introduction: Applications and Issues 1 1.0 Outline

More information

Spot and forward dynamic utilities. and their associated pricing systems. Thaleia Zariphopoulou. UT, Austin

Spot and forward dynamic utilities. and their associated pricing systems. Thaleia Zariphopoulou. UT, Austin Spot and forward dynamic utilities and their associated pricing systems Thaleia Zariphopoulou UT, Austin 1 Joint work with Marek Musiela (BNP Paribas, London) References A valuation algorithm for indifference

More information

Economics 2010c: Lecture 4 Precautionary Savings and Liquidity Constraints

Economics 2010c: Lecture 4 Precautionary Savings and Liquidity Constraints Economics 2010c: Lecture 4 Precautionary Savings and Liquidity Constraints David Laibson 9/11/2014 Outline: 1. Precautionary savings motives 2. Liquidity constraints 3. Application: Numerical solution

More information

DM559/DM545 Linear and integer programming

DM559/DM545 Linear and integer programming Department of Mathematics and Computer Science University of Southern Denmark, Odense May 22, 2018 Marco Chiarandini DM559/DM55 Linear and integer programming Sheet, Spring 2018 [pdf format] Contains Solutions!

More information

Reinforcement Learning and Simulation-Based Search

Reinforcement Learning and Simulation-Based Search Reinforcement Learning and Simulation-Based Search David Silver Outline 1 Reinforcement Learning 2 3 Planning Under Uncertainty Reinforcement Learning Markov Decision Process Definition A Markov Decision

More information

Progressive Hedging for Multi-stage Stochastic Optimization Problems

Progressive Hedging for Multi-stage Stochastic Optimization Problems Progressive Hedging for Multi-stage Stochastic Optimization Problems David L. Woodruff Jean-Paul Watson Graduate School of Management University of California, Davis Davis, CA 95616, USA dlwoodruff@ucdavis.edu

More information

Martingale Pricing Theory in Discrete-Time and Discrete-Space Models

Martingale Pricing Theory in Discrete-Time and Discrete-Space Models IEOR E4707: Foundations of Financial Engineering c 206 by Martin Haugh Martingale Pricing Theory in Discrete-Time and Discrete-Space Models These notes develop the theory of martingale pricing in a discrete-time,

More information

ROBUST OPTIMIZATION OF MULTI-PERIOD PRODUCTION PLANNING UNDER DEMAND UNCERTAINTY. A. Ben-Tal, B. Golany and M. Rozenblit

ROBUST OPTIMIZATION OF MULTI-PERIOD PRODUCTION PLANNING UNDER DEMAND UNCERTAINTY. A. Ben-Tal, B. Golany and M. Rozenblit ROBUST OPTIMIZATION OF MULTI-PERIOD PRODUCTION PLANNING UNDER DEMAND UNCERTAINTY A. Ben-Tal, B. Golany and M. Rozenblit Faculty of Industrial Engineering and Management, Technion, Haifa 32000, Israel ABSTRACT

More information

Dynamic Risk Management in Electricity Portfolio Optimization via Polyhedral Risk Functionals

Dynamic Risk Management in Electricity Portfolio Optimization via Polyhedral Risk Functionals Dynamic Risk Management in Electricity Portfolio Optimization via Polyhedral Risk Functionals A. Eichhorn and W. Römisch Humboldt-University Berlin, Department of Mathematics, Germany http://www.math.hu-berlin.de/~romisch

More information

CS 188: Artificial Intelligence

CS 188: Artificial Intelligence CS 188: Artificial Intelligence Markov Decision Processes Dan Klein, Pieter Abbeel University of California, Berkeley Non-Deterministic Search 1 Example: Grid World A maze-like problem The agent lives

More information

Provably Near-Optimal Balancing Policies for Multi-Echelon Stochastic Inventory Control Models

Provably Near-Optimal Balancing Policies for Multi-Echelon Stochastic Inventory Control Models Provably Near-Optimal Balancing Policies for Multi-Echelon Stochastic Inventory Control Models Retsef Levi Robin Roundy Van Anh Truong February 13, 2006 Abstract We develop the first algorithmic approach

More information

EE266 Homework 5 Solutions

EE266 Homework 5 Solutions EE, Spring 15-1 Professor S. Lall EE Homework 5 Solutions 1. A refined inventory model. In this problem we consider an inventory model that is more refined than the one you ve seen in the lectures. The

More information

Multi-armed bandit problems

Multi-armed bandit problems Multi-armed bandit problems Stochastic Decision Theory (2WB12) Arnoud den Boer 13 March 2013 Set-up 13 and 14 March: Lectures. 20 and 21 March: Paper presentations (Four groups, 45 min per group). Before

More information

Behavioral pricing of energy swing options by stochastic bilevel optimization

Behavioral pricing of energy swing options by stochastic bilevel optimization Energy Syst (2016) 7:637 662 DOI 10.1007/s12667-016-0190-z ORIGINAL PAPER Behavioral pricing of energy swing options by stochastic bilevel optimization Peter Gross 1 Georg Ch. Pflug 2,3 Received: 20 January

More information

Dynamic Asset and Liability Management Models for Pension Systems

Dynamic Asset and Liability Management Models for Pension Systems Dynamic Asset and Liability Management Models for Pension Systems The Comparison between Multi-period Stochastic Programming Model and Stochastic Control Model Muneki Kawaguchi and Norio Hibiki June 1,

More information

Fast Convergence of Regress-later Series Estimators

Fast Convergence of Regress-later Series Estimators Fast Convergence of Regress-later Series Estimators New Thinking in Finance, London Eric Beutner, Antoon Pelsser, Janina Schweizer Maastricht University & Kleynen Consultants 12 February 2014 Beutner Pelsser

More information

The Irrevocable Multi-Armed Bandit Problem

The Irrevocable Multi-Armed Bandit Problem The Irrevocable Multi-Armed Bandit Problem Ritesh Madan Qualcomm-Flarion Technologies May 27, 2009 Joint work with Vivek Farias (MIT) 2 Multi-Armed Bandit Problem n arms, where each arm i is a Markov Decision

More information

Dynamic Programming (DP) Massimo Paolucci University of Genova

Dynamic Programming (DP) Massimo Paolucci University of Genova Dynamic Programming (DP) Massimo Paolucci University of Genova DP cannot be applied to each kind of problem In particular, it is a solution method for problems defined over stages For each stage a subproblem

More information

On Complexity of Multistage Stochastic Programs

On Complexity of Multistage Stochastic Programs On Complexity of Multistage Stochastic Programs Alexander Shapiro School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205, USA e-mail: ashapiro@isye.gatech.edu

More information

Chapter 21. Dynamic Programming CONTENTS 21.1 A SHORTEST-ROUTE PROBLEM 21.2 DYNAMIC PROGRAMMING NOTATION

Chapter 21. Dynamic Programming CONTENTS 21.1 A SHORTEST-ROUTE PROBLEM 21.2 DYNAMIC PROGRAMMING NOTATION Chapter 21 Dynamic Programming CONTENTS 21.1 A SHORTEST-ROUTE PROBLEM 21.2 DYNAMIC PROGRAMMING NOTATION 21.3 THE KNAPSACK PROBLEM 21.4 A PRODUCTION AND INVENTORY CONTROL PROBLEM 23_ch21_ptg01_Web.indd

More information

We formulate and solve two new stochastic linear programming formulations of appointment scheduling

We formulate and solve two new stochastic linear programming formulations of appointment scheduling Published online ahead of print December 7, 2011 INFORMS Journal on Computing Articles in Advance, pp. 1 17 issn 1091-9856 eissn 1526-5528 http://dx.doi.org/10.1287/ijoc.1110.0482 2011 INFORMS Dynamic

More information

Stochastic Optimization

Stochastic Optimization Stochastic Optimization Introduction and Examples Alireza Ghaffari-Hadigheh Azarbaijan Shahid Madani University (ASMU) hadigheha@azaruniv.edu Fall 2017 Alireza Ghaffari-Hadigheh (ASMU) Stochastic Optimization

More information

Energy Systems under Uncertainty: Modeling and Computations

Energy Systems under Uncertainty: Modeling and Computations Energy Systems under Uncertainty: Modeling and Computations W. Römisch Humboldt-University Berlin Department of Mathematics www.math.hu-berlin.de/~romisch Systems Analysis 2015, November 11 13, IIASA (Laxenburg,

More information

Introduction to Dynamic Programming

Introduction to Dynamic Programming Introduction to Dynamic Programming http://bicmr.pku.edu.cn/~wenzw/bigdata2018.html Acknowledgement: this slides is based on Prof. Mengdi Wang s and Prof. Dimitri Bertsekas lecture notes Outline 2/65 1

More information

Handout 4: Deterministic Systems and the Shortest Path Problem

Handout 4: Deterministic Systems and the Shortest Path Problem SEEM 3470: Dynamic Optimization and Applications 2013 14 Second Term Handout 4: Deterministic Systems and the Shortest Path Problem Instructor: Shiqian Ma January 27, 2014 Suggested Reading: Bertsekas

More information

17 MAKING COMPLEX DECISIONS

17 MAKING COMPLEX DECISIONS 267 17 MAKING COMPLEX DECISIONS The agent s utility now depends on a sequence of decisions In the following 4 3grid environment the agent makes a decision to move (U, R, D, L) at each time step When the

More information

Optimal Policies for Distributed Data Aggregation in Wireless Sensor Networks

Optimal Policies for Distributed Data Aggregation in Wireless Sensor Networks Optimal Policies for Distributed Data Aggregation in Wireless Sensor Networks Hussein Abouzeid Department of Electrical Computer and Systems Engineering Rensselaer Polytechnic Institute abouzeid@ecse.rpi.edu

More information

Scenario Generation for Stochastic Programming Introduction and selected methods

Scenario Generation for Stochastic Programming Introduction and selected methods Michal Kaut Scenario Generation for Stochastic Programming Introduction and selected methods SINTEF Technology and Society September 2011 Scenario Generation for Stochastic Programming 1 Outline Introduction

More information

3. The Dynamic Programming Algorithm (cont d)

3. The Dynamic Programming Algorithm (cont d) 3. The Dynamic Programming Algorithm (cont d) Last lecture e introduced the DPA. In this lecture, e first apply the DPA to the chess match example, and then sho ho to deal ith problems that do not match

More information

On the Marginal Value of Water for Hydroelectricity

On the Marginal Value of Water for Hydroelectricity Chapter 31 On the Marginal Value of Water for Hydroelectricity Andy Philpott 21 31.1 Introduction This chapter discusses optimization models for computing prices in perfectly competitive wholesale electricity

More information

Optimal investments under dynamic performance critria. Lecture IV

Optimal investments under dynamic performance critria. Lecture IV Optimal investments under dynamic performance critria Lecture IV 1 Utility-based measurement of performance 2 Deterministic environment Utility traits u(x, t) : x wealth and t time Monotonicity u x (x,

More information

Markov Decision Processes

Markov Decision Processes Markov Decision Processes Robert Platt Northeastern University Some images and slides are used from: 1. CS188 UC Berkeley 2. AIMA 3. Chris Amato Stochastic domains So far, we have studied search Can use

More information

Foundations of Artificial Intelligence

Foundations of Artificial Intelligence Foundations of Artificial Intelligence 44. Monte-Carlo Tree Search: Introduction Thomas Keller Universität Basel May 27, 2016 Board Games: Overview chapter overview: 41. Introduction and State of the Art

More information

Flexible Demand Management under Time-Varying Prices. Yong Liang

Flexible Demand Management under Time-Varying Prices. Yong Liang Flexible Demand Management under Time-Varying Prices by Yong Liang A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Industrial Engineering

More information

Advanced Operations Research Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology, Madras

Advanced Operations Research Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology, Madras Advanced Operations Research Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology, Madras Lecture 21 Successive Shortest Path Problem In this lecture, we continue our discussion

More information