Stochastic Dual Dynamic Programming Algorithm for Multistage Stochastic Programming

1 Stochastic Dual Dynamic Programming Algorithm for Multistage Stochastic Programming
Final presentation, ISyE 8813, Fall 2011
Guido Lagos, Wajdi Tekaya
Georgia Institute of Technology, November 30, 2011

2 Multistage Stochastic Linear Programming
Reference: Analysis of Stochastic Dual Dynamic Programming Method, Alex Shapiro (2010).
$$\min_{\substack{A_1 x_1 = b_1 \\ x_1 \ge 0}} c_1^T x_1 + \mathbb{E}\Bigg[ \min_{\substack{B_2 x_1 + A_2 x_2 = b_2 \\ x_2 \ge 0}} c_2^T x_2 + \mathbb{E}\Big[ \cdots + \mathbb{E}\Big[ \min_{\substack{B_T x_{T-1} + A_T x_T = b_T \\ x_T \ge 0}} c_T^T x_T \Big] \Big] \Bigg]$$
Applications: planning problems in mining, energy, forestry, etc.
Challenges:
- tractability of the nested expectations $\mathbb{E}[\,\cdot\,]$
- stagewise dependence of the data process $\{\xi_t := (c_t, B_t, A_t, b_t)\}_{t=1,\dots,T}$
- curse of dimensionality

3 Summary
1. Two-stage stochastic programming: cutting-plane method; SDDP algorithm for two-stage SP
2. Convergence and main contributions

4 Two-stage stochastic programming
Outline (reprise):
1. Two-stage stochastic programming: cutting-plane method; SDDP algorithm for two-stage SP
2. Convergence and main contributions

5 Two-stage stochastic programming
True problem:
$$\min_{\substack{Ax=b \\ x\ge 0}} c^T x + \mathbb{E}\Big[ \min_{\substack{Tx+Wy=h \\ y\ge 0}} q^T y \Big]$$
Take a random sample $\xi^1,\dots,\xi^N$ and approximate $\mathbb{E}[\,\cdot\,]$ by $\frac{1}{N}\sum_{j=1}^N[\,\cdot\,]$. This gives the Sample Average Approximation (SAA) problem:
$$\min_{\substack{Ax=b \\ x\ge 0}} c^T x + \underbrace{\frac{1}{N}\sum_{j=1}^N \underbrace{\min_{\substack{T_j x + W_j y = h_j \\ y\ge 0}} q_j^T y}_{Q(x,\xi^j):=}}_{\mathcal{Q}(x):=}$$
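As a concrete (if naive) illustration of the SAA objective on this slide, here is a minimal Python sketch, assuming SciPy's HiGHS-based linprog for the second-stage LPs; the function name saa_objective and the scenario tuples (q, T, W, h) are our own illustrative choices, not from the slides.

```python
# A tiny sketch of the SAA construction on slide 5, assuming SciPy's HiGHS
# linprog for the second-stage LPs; scenario tuples (q, T, W, h) are illustrative.
import numpy as np
from scipy.optimize import linprog

def saa_objective(x, c, sample):
    """c^T x + (1/N) sum_j Q(x, xi^j), with Q(x, xi) = min{q^T y : Wy = h - Tx, y >= 0}."""
    vals = [linprog(q, A_eq=W, b_eq=h - T @ x, bounds=(0, None), method="highs").fun
            for q, T, W, h in sample]
    return c @ x + np.mean(vals)   # assumes relatively complete recourse (all LPs feasible)
```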

6 Two-stage stochastic programming / Cutting-plane method: Basic Ingredients
$$\min_{\substack{Ax=b \\ x\ge 0}} c^T x + \underbrace{\frac{1}{N}\sum_{j=1}^N Q(x,\xi^j)}_{\mathcal{Q}(x):=} \qquad \text{where } Q(x,\xi) := \min_{\substack{Tx+Wy=h \\ y\ge 0}} q^T y$$
Assume relatively complete recourse, i.e. for every feasible $x$, $Q(x,\xi) < \infty$ a.s. Then $\frac{1}{N}\sum_{j=1}^N Q(\cdot,\xi^j)$ is convex piecewise-linear, and the SAA problem is a (large) linear program.
For $f$ convex: $\partial f(x_0) := \{ d : f(x) \ge f(x_0) + d^T(x - x_0) \ \forall x \}$.
$$\partial Q(\cdot,\xi)(x_0) = -T^T \{\pi : \pi \text{ opt. sol. of the dual of } Q(x_0,\xi)\}$$
$$\partial \Big[\tfrac{1}{N}\textstyle\sum_{j=1}^N Q(\cdot,\xi^j)\Big](x_0) = \frac{1}{N}\sum_{j=1}^N \big(-T_j^T\big)\{\pi : \pi \text{ opt. sol. of the dual of } Q(x_0,\xi^j)\}$$
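The slide states the subdifferential formula without proof; for completeness, here is the standard one-line LP-duality argument (our reconstruction, not on the slides) showing why an optimal dual solution at $x_0$ yields a subgradient.

```latex
% Standard LP-duality argument for the subgradient formula (reconstruction):
% let \pi^* be an optimal solution of the dual of Q(x_0, \xi). Then
\begin{align*}
Q(x,\xi) &= \max_{\pi :\, W^T \pi \le q} \pi^T (h - Tx)  && \text{(LP duality)} \\
         &\ge (\pi^*)^T (h - T x)
          = Q(x_0,\xi) + \big({-T^T \pi^*}\big)^T (x - x_0),
\end{align*}
% hence  -T^T \pi^*  \in  \partial Q(\cdot,\xi)(x_0).
```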

7 Two-stage stochastic programming / Cutting-plane method: Basic Ingredients (cont.)
$$\min_{\substack{Ax=b \\ x\ge 0}} c^T x + \underbrace{\frac{1}{N}\sum_{j=1}^N Q(x,\xi^j)}_{\mathcal{Q}(x):=} \qquad \text{where } Q(x,\xi) := \min_{\substack{Tx+Wy=h \\ y\ge 0}} q^T y$$
Assume relatively complete recourse, i.e. for every feasible $x$, $Q(x,\xi) < \infty$ a.s.
Conclusion: $c^T x + \frac{1}{N}\sum_{j=1}^N Q(x,\xi^j)$ is:
- easy to evaluate at a given $x$
- easy to compute a subgradient of at a given $x$
- difficult to optimize directly

8 Two-stage stochastic programming / Cutting-plane method: Cutting Plane Algorithm
Given a sample $\{\xi^j = (q_j, T_j, W_j, h_j)\}_{j=1,\dots,N}$:
0. $k \leftarrow 1$; $LB_1 \leftarrow -\infty$; $UB_1 \leftarrow +\infty$; $\underline{\mathcal{Q}}_1(x) \equiv LB_1$ for all $x$.
1. (Forward Step) Let $x^k$ be the solution of $LB_k \leftarrow \min_{Ax=b,\ x\ge 0} c^T x + \underline{\mathcal{Q}}_k(x)$.
2. (Backward Step) Compute $\mathcal{Q}(x^k) \leftarrow \frac{1}{N}\sum_{j=1}^N Q(x^k,\xi^j)$ and the subgradient $g^k \leftarrow -\frac{1}{N}\sum_{j=1}^N T_j^T \pi^{j,k}$. Let $UB_k \leftarrow c^T x^k + \mathcal{Q}(x^k)$. If $UB_k - LB_k \le \epsilon$, END. Else ($LB_k < UB_k$), add the plane $\mathcal{Q}(x^k) + (g^k)^T(x - x^k)$ to $\underline{\mathcal{Q}}_k$:
$$\underline{\mathcal{Q}}_{k+1}(x) \leftarrow \max\{\underline{\mathcal{Q}}_k(x),\ \mathcal{Q}(x^k) + (g^k)^T(x - x^k)\}$$
3. $k \leftarrow k+1$, iterate from 1.
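A minimal, self-contained Python sketch of this cutting-plane loop, assuming SciPy's HiGHS-based linprog (whose eqlin.marginals field supplies the equality-constraint duals $\pi$); the helper names and the finite stand-in for $LB_1 = -\infty$ are illustrative choices, not the authors' code.

```python
# Cutting-plane (Kelley / L-shaped type) sketch of slide 8 on the SAA problem.
import numpy as np
from scipy.optimize import linprog

def second_stage(x, xi):
    """Q(x, xi) = min q^T y s.t. W y = h - T x, y >= 0; returns (value, dual pi)."""
    q, T, W, h = xi
    res = linprog(q, A_eq=W, b_eq=h - T @ x, bounds=(0, None), method="highs")
    return res.fun, res.eqlin.marginals   # assumes relatively complete recourse

def cutting_plane(c, A, b, sample, eps=1e-6, max_iter=100):
    n = len(c)
    cuts = []                                  # (alpha, beta): theta >= alpha + beta^T x
    for _ in range(max_iter):
        # Forward step: master LP in (x, theta); a large negative bound on theta
        # stands in for LB_1 = -infinity on the first iteration.
        A_eq = np.hstack([A, np.zeros((A.shape[0], 1))])
        if cuts:
            A_ub = np.array([np.append(beta, -1.0) for _, beta in cuts])
            b_ub = np.array([-alpha for alpha, _ in cuts])
        else:
            A_ub, b_ub = None, None
        res = linprog(np.append(c, 1.0), A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b,
                      bounds=[(0, None)] * n + [(-1e8, None)], method="highs")
        x, LB = res.x[:n], res.fun
        # Backward step: average second-stage values and subgradients over scenarios.
        vals, grads = [], []
        for q, T, W, h in sample:
            v, pi = second_stage(x, (q, T, W, h))
            vals.append(v)
            grads.append(-T.T @ pi)            # subgradient of Q(., xi) at x
        Qx, g = float(np.mean(vals)), np.mean(grads, axis=0)
        UB = c @ x + Qx
        if UB - LB <= eps:
            break
        cuts.append((Qx - g @ x, g))           # the plane Q(x^k) + g^T (z - x^k)
    return x, LB, UB
```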

9 Two-stage stochastic programming / SDDP algorithm for two-stage SP: SDDP algorithm I
Given a sample $\{\xi^j = (q_j, T_j, W_j, h_j)\}_{j=1,\dots,N}$:
0. $k \leftarrow 1$; $UB_1 \leftarrow +\infty$; $LB_1 \leftarrow -\infty$; $\underline{\mathcal{Q}}_1(x) \equiv LB_1$ for all $x$.
1. Forward Step
1.1 Let $x^k$ be the solution of $LB_k \leftarrow \min_{Ax=b,\ x\ge 0} c^T x + \underline{\mathcal{Q}}_k(x)$.
1.2 Take a subsample $\{\xi^{(j)}\}_{j=1}^M$ of $\{\xi^j\}_{j=1}^N$ ($N \gg M$), and from the values $\vartheta_j := c^T x^k + Q(x^k, \xi^{(j)})$, $j = 1,\dots,M$, compute a $(1-\alpha)$-confidence upper bound on the true optimal value $\vartheta^*$:
$$UB_k \leftarrow \bar\vartheta + z_{\alpha/2}\, \sigma_\vartheta / \sqrt{M}, \quad \text{where } \bar\vartheta := \frac{1}{M}\sum_{j=1}^M \vartheta_j \ \text{ and } \ \sigma_\vartheta^2 := \frac{1}{M-1}\sum_{j=1}^M (\vartheta_j - \bar\vartheta)^2.$$
1.3 If $UB_k - LB_k \le \epsilon$, END.
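A small sketch of step 1.2, assuming SciPy's norm.ppf for the normal quantile $z_{\alpha/2}$; the function name is illustrative.

```python
# The (1 - alpha)-confidence upper bound of step 1.2 from M subsampled costs.
import numpy as np
from scipy.stats import norm

def confidence_upper_bound(vartheta, alpha=0.05):
    """vartheta: array of the M values c^T x^k + Q(x^k, xi^(j))."""
    vartheta = np.asarray(vartheta)
    M = len(vartheta)
    sigma = vartheta.std(ddof=1)          # sqrt of (1/(M-1)) sum (vartheta_j - mean)^2
    return vartheta.mean() + norm.ppf(1 - alpha / 2) * sigma / np.sqrt(M)
```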

10 Two-stage stochastic programming / SDDP algorithm for two-stage SP: SDDP algorithm II
2. Backward Step
2.1 Compute
$$\mathcal{Q}(x^k) \leftarrow \frac{1}{N}\sum_{j=1}^N Q(x^k,\xi^j) = \frac{1}{N}\sum_{j=1}^N \min_{\substack{T_j x^k + W_j y = h_j \\ y \ge 0}} q_j^T y, \qquad g^k \leftarrow -\frac{1}{N}\sum_{j=1}^N T_j^T \pi^{j,k} \ \text{(subgradient)}.$$
2.2 Add the plane $\mathcal{Q}(x^k) + (g^k)^T(\cdot - x^k)$ to $\underline{\mathcal{Q}}_k(\cdot)$:
$$\underline{\mathcal{Q}}_{k+1}(x) \leftarrow \max\{\underline{\mathcal{Q}}_k(x),\ \mathcal{Q}(x^k) + (g^k)^T(x - x^k)\}$$
3. $k \leftarrow k+1$, iterate from 1.

11 Outline (reprise)
1. Two-stage stochastic programming: cutting-plane method; SDDP algorithm for two-stage SP
2. Convergence and main contributions

12 True problem
$$\min_{\substack{A_1 x_1 = b_1 \\ x_1 \ge 0}} c_1^T x_1 + \mathbb{E}\Bigg[ \min_{\substack{B_2 x_1 + A_2 x_2 = b_2 \\ x_2 \ge 0}} c_2^T x_2 + \mathbb{E}\Big[ \cdots + \mathbb{E}\Big[ \min_{\substack{B_T x_{T-1} + A_T x_T = b_T \\ x_T \ge 0}} c_T^T x_T \Big] \Big] \Bigg]$$
Equivalently (Dynamic Programming equations):
$$\min_{\substack{A_1 x_1 = b_1 \\ x_1 \ge 0}} c_1^T x_1 + \underbrace{\mathbb{E}[Q_2(x_1,\xi_2)]}_{\mathcal{Q}_2(x_1):=}$$
where
$$Q_t(x_{t-1},\xi_t) := \inf_{\substack{B_t x_{t-1} + A_t x_t = b_t \\ x_t \ge 0}} c_t^T x_t + \underbrace{\mathbb{E}[Q_{t+1}(x_t,\xi_{t+1})]}_{\mathcal{Q}_{t+1}(x_t)}, \quad t = 2,\dots,T-1,$$
$$Q_T(x_{T-1},\xi_T) := \inf_{\substack{B_T x_{T-1} + A_T x_T = b_T \\ x_T \ge 0}} c_T^T x_T.$$
Assumptions:
1. The process is stagewise independent, i.e. $\xi_{t+1}$ is independent of $\xi_1,\dots,\xi_t$.
2. The problem has relatively complete recourse.

13 SAA problem
Take a random sample $\{\tilde\xi_{tj} = (\tilde c_{tj}, \tilde A_{tj}, \tilde B_{tj}, \tilde b_{tj})\}_{j=1,\dots,N_t}$ for each stage $t = 2,\dots,T$. The SAA problem is:
$$\min_{\substack{A_1 x_1 = b_1 \\ x_1 \ge 0}} c_1^T x_1 + \frac{1}{N_2}\sum_{j=1}^{N_2} \min_{\substack{\tilde B_{2j} x_1 + \tilde A_{2j} x_2 = \tilde b_{2j} \\ x_2 \ge 0}} \tilde c_{2j}^T x_2 + \frac{1}{N_3}\sum \cdots + \frac{1}{N_T}\sum_{j=1}^{N_T} \min_{\substack{\tilde B_{Tj} x_{T-1} + \tilde A_{Tj} x_T = \tilde b_{Tj} \\ x_T \ge 0}} \tilde c_{Tj}^T x_T$$
Equivalently (Dynamic Programming equations):
$$\min_{\substack{A_1 x_1 = b_1 \\ x_1 \ge 0}} c_1^T x_1 + \underbrace{\frac{1}{N_2}\sum_{j=1}^{N_2} Q_{2,j}(x_1)}_{\mathcal{Q}_2(x_1)}$$
where
$$Q_{t,j}(x_{t-1}) := \min_{\substack{\tilde B_{tj} x_{t-1} + \tilde A_{tj} x_t = \tilde b_{tj} \\ x_t \ge 0}} \tilde c_{tj}^T x_t + \underbrace{\frac{1}{N_{t+1}}\sum_{i=1}^{N_{t+1}} Q_{t+1,i}(x_t)}_{\mathcal{Q}_{t+1}(x_t)}, \quad t = 2,\dots,T-1,$$
$$Q_{T,j}(x_{T-1}) := \min_{\substack{\tilde B_{Tj} x_{T-1} + \tilde A_{Tj} x_T = \tilde b_{Tj} \\ x_T \ge 0}} \tilde c_{Tj}^T x_T.$$

14 SDDP method: the idea
Key fact: the cost-to-go functions $\mathcal{Q}_t(x_{t-1}) = \frac{1}{N_t}\sum_{j=1}^{N_t} Q_{t,j}(x_{t-1})$, $t = 2,\dots,T$, are convex piecewise-linear.
Approximate
$$\min_{\substack{A_1 x_1 = b_1 \\ x_1 \ge 0}} c_1^T x_1 + \mathcal{Q}_2(x_1) \quad \text{by} \quad \min_{\substack{A_1 x_1 = b_1 \\ x_1 \ge 0}} c_1^T x_1 + \underline{\mathcal{Q}}_2(x_1)$$
$$\min_{\substack{\tilde B_{tj} x_{t-1} + \tilde A_{tj} x_t = \tilde b_{tj} \\ x_t \ge 0}} \tilde c_{tj}^T x_t + \mathcal{Q}_{t+1}(x_t) \quad \text{by} \quad \min_{\substack{\tilde B_{tj} x_{t-1} + \tilde A_{tj} x_t = \tilde b_{tj} \\ x_t \ge 0}} \tilde c_{tj}^T x_t + \underline{\mathcal{Q}}_{t+1}(x_t)$$
where $\underline{\mathcal{Q}}_2(\cdot),\dots,\underline{\mathcal{Q}}_T(\cdot)$ are convex piecewise-linear and $\underline{\mathcal{Q}}_t(\cdot) \le \mathcal{Q}_t(\cdot)$, $t = 2,\dots,T$.
In successive iterations, refine the lower approximations $\underline{\mathcal{Q}}_t(\cdot)$ using subgradients of $\mathcal{Q}_t(\cdot)$:
$$\underline{\mathcal{Q}}_t^k(\cdot) \le \underline{\mathcal{Q}}_t^{k+1}(\cdot) \le \underline{\mathcal{Q}}_t^{k+2}(\cdot) \le \cdots \le \mathcal{Q}_t(\cdot)$$
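One simple way to realize these lower approximations in code is as growing collections of affine cuts whose pointwise maximum is the model; a minimal sketch (our illustration, with hypothetical names, not from the slides):

```python
# A minimal data structure for the lower approximations Q_t of slide 14:
# a cut collection representing max_i (alpha_i + beta_i^T x) <= Q_t(x).
import numpy as np

class Cuts:
    """Piecewise-linear lower model of a convex cost-to-go function."""
    def __init__(self, lb=-1e8):
        self.alphas, self.betas = [lb], [None]   # seed with a constant plane (stand-in LB)

    def add(self, alpha, beta):
        self.alphas.append(alpha)
        self.betas.append(np.asarray(beta))

    def value(self, x):
        return max(a + (0.0 if b is None else b @ x)
                   for a, b in zip(self.alphas, self.betas))
```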

15 SDDP method: Forward step
At iteration $k \ge 1$ we have lower approximations $\underline{\mathcal{Q}}_2,\dots,\underline{\mathcal{Q}}_T$. Take a subsample $\{(\tilde\xi^{(j)}_2,\dots,\tilde\xi^{(j)}_T)\}_{j=1}^M$ of the original sample. For $j = 1,\dots,M$, take the sampled process $(\tilde\xi^{(j)}_2,\dots,\tilde\xi^{(j)}_T)$ and solve, stage by stage,
$$\min_{\substack{A_1 x_1 = b_1 \\ x_1 \ge 0}} c_1^T x_1 + \underline{\mathcal{Q}}_2(x_1) \ \longrightarrow\ x_{1j}$$
given $\tilde\xi^{(j)}_t$ and $x_{t-1,j}$: $\min_{\substack{\tilde B_{t(j)} x_{t-1,j} + \tilde A_{t(j)} x_t = \tilde b_{t(j)} \\ x_t \ge 0}} \tilde c_{t(j)}^T x_t + \underline{\mathcal{Q}}_{t+1}(x_t) \ \longrightarrow\ x_{tj}(\tilde\xi^{(j)}_{[t]}), \quad t = 2,\dots,T-1$
given $\tilde\xi^{(j)}_T$ and $x_{T-1,j}$: $\min_{\substack{\tilde B_{T(j)} x_{T-1,j} + \tilde A_{T(j)} x_T = \tilde b_{T(j)} \\ x_T \ge 0}} \tilde c_{T(j)}^T x_T \ \longrightarrow\ x_{Tj}(\tilde\xi^{(j)}_{[T]})$
obtaining candidate policy values $x_{1j}, x_{2j},\dots,x_{Tj}$ with cost $\vartheta_j \leftarrow \sum_{t=1}^T \tilde c_{t(j)}^T x_{tj}$.
This yields a $(1-\alpha)$-confidence upper bound on the true optimal value $\vartheta^*$:
$$UB_k \leftarrow \bar\vartheta + z_{\alpha}\, \sigma_\vartheta / \sqrt{M}, \quad \text{where } \bar\vartheta := \frac{1}{M}\sum_{j=1}^M \vartheta_j \ \text{ and } \ \sigma_\vartheta^2 := \frac{1}{M-1}\sum_{j=1}^M (\vartheta_j - \bar\vartheta)^2.$$
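A sketch of one forward pass under stagewise independence; solve_stage is a hypothetical LP helper (minimizing $c^T x + \theta$ over $Ax = \text{rhs}$, $x \ge 0$, $\theta$ above the given cuts), and the data layout is an illustrative assumption, not the authors' code.

```python
# SDDP forward pass sketch for slide 15 (stagewise-independent sampling).
import numpy as np

def forward_pass(stage_data, cuts, solve_stage, rng, M, z_alpha=1.645):
    """stage_data[0] = [(c1, A1, None, b1)]; stage_data[t] = the N_t realizations
    (c, A, B, b) for t >= 1; cuts[t] approximates the stage-(t+2) cost-to-go,
    with cuts[-1] == [] at the last stage. z_alpha = 1.645 is roughly alpha = 0.05."""
    T = len(stage_data)
    paths, costs = [], []
    for _ in range(M):                         # M sampled scenario paths
        c, A, _, b = stage_data[0][0]
        x = solve_stage(c, A, b, cuts[0])      # first-stage problem
        xs, cost = [x], c @ x
        for t in range(1, T):
            c, A, B, b = stage_data[t][rng.integers(len(stage_data[t]))]
            x = solve_stage(c, A, b - B @ xs[-1], cuts[t])
            xs.append(x)
            cost += c @ x                      # vartheta_j = sum_t c_t^T x_tj
        paths.append(xs)
        costs.append(cost)
    costs = np.asarray(costs)
    ub = costs.mean() + z_alpha * costs.std(ddof=1) / np.sqrt(M)
    return paths, ub
```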

16 SDDP method: Backward step
We have candidate values $\bar x_1, \bar x_2, \dots, \bar x_T$.
At stage $t = T$: for $x_{T-1} = \bar x_{T-1}$ and for $j = 1,\dots,N_T$, solve
$$Q_{T,j}(\bar x_{T-1}) := \min_{\substack{\tilde B_{T,j} \bar x_{T-1} + \tilde A_{T,j} x_T = \tilde b_{T,j} \\ x_T \ge 0}} \tilde c_{T,j}^T x_T$$
and let $\pi_{T,j}$ be an optimal dual solution. Let
$$\mathcal{Q}_T(\bar x_{T-1}) := \frac{1}{N_T}\sum_{j=1}^{N_T} Q_{T,j}(\bar x_{T-1}), \qquad g_T := -\frac{1}{N_T}\sum_{j=1}^{N_T} \tilde B_{T,j}^T \pi_{T,j}.$$
Add the cut
$$L_T(x_{T-1}) := \mathcal{Q}_T(\bar x_{T-1}) + g_T^T (x_{T-1} - \bar x_{T-1})$$
to the lower approximation $\underline{\mathcal{Q}}_T$ used at stage $t = T-1$: $\underline{\mathcal{Q}}_T(\cdot) \leftarrow \max\{\underline{\mathcal{Q}}_T(\cdot),\ L_T(\cdot)\}$.

17 SDDP method: Backward step (cont.)
At stage $t = T-1$: for $x_{T-2} = \bar x_{T-2}$ and for $j = 1,\dots,N_{T-1}$, solve
$$Q_{T-1,j}(\bar x_{T-2}) := \min_{\substack{\tilde B_{T-1,j} \bar x_{T-2} + \tilde A_{T-1,j} x_{T-1} = \tilde b_{T-1,j} \\ x_{T-1} \ge 0}} \tilde c_{T-1,j}^T x_{T-1} + \underline{\mathcal{Q}}_T(x_{T-1})$$
and let $\pi_{T-1,j}$ be an optimal dual solution. Let
$$\mathcal{Q}_{T-1}(\bar x_{T-2}) := \frac{1}{N_{T-1}}\sum_{j=1}^{N_{T-1}} Q_{T-1,j}(\bar x_{T-2}), \qquad g_{T-1} := -\frac{1}{N_{T-1}}\sum_{j=1}^{N_{T-1}} \tilde B_{T-1,j}^T \pi_{T-1,j}.$$
Add the cut
$$L_{T-1}(x_{T-2}) := \mathcal{Q}_{T-1}(\bar x_{T-2}) + g_{T-1}^T (x_{T-2} - \bar x_{T-2})$$
to the lower approximation $\underline{\mathcal{Q}}_{T-1}$ used at stage $t = T-2$: $\underline{\mathcal{Q}}_{T-1}(\cdot) \leftarrow \max\{\underline{\mathcal{Q}}_{T-1}(\cdot),\ L_{T-1}(\cdot)\}$.
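Slides 16-17 apply the same cut computation at stages $T$ and $T-1$; here is a sketch of the generic stage-$t$ version, again assuming SciPy's HiGHS linprog (eqlin.marginals for the duals) and an illustrative data layout.

```python
# Backward-step cut computation at a generic stage t (slides 16-17).
import numpy as np
from scipy.optimize import linprog

def backward_cut(stage_t_data, x_prev, next_cuts):
    """Return (alpha, beta) with L_t(x) = alpha + beta^T x, i.e. the cut
    Q_t(x_prev) + g_t^T (x - x_prev). stage_t_data: N_t realizations (c, A, B, b);
    next_cuts: (alpha_i, beta_i) planes modeling Q_{t+1} (empty at t = T);
    assumes next_cuts, when nonempty, bounds theta below."""
    vals, grads = [], []
    for c, A, B, b in stage_t_data:
        n = len(c)
        if next_cuts:   # t < T: LP in (x_t, theta), theta >= alpha_i + beta_i^T x_t
            obj = np.append(c, 1.0)
            A_eq = np.hstack([A, np.zeros((A.shape[0], 1))])
            A_ub = np.array([np.append(beta, -1.0) for _, beta in next_cuts])
            b_ub = np.array([-alpha for alpha, _ in next_cuts])
            bounds = [(0, None)] * n + [(None, None)]
        else:           # last stage t = T: no cost-to-go term
            obj, A_eq, A_ub, b_ub = c, A, None, None
            bounds = [(0, None)] * n
        res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b - B @ x_prev,
                      bounds=bounds, method="highs")
        vals.append(res.fun)
        grads.append(-B.T @ res.eqlin.marginals)   # subgradient w.r.t. x_{t-1}
    Q_bar, g = float(np.mean(vals)), np.mean(grads, axis=0)
    return Q_bar - g @ x_prev, g
```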

18 SDDP method: Backward step (cont.)
Continue in the same way at stages $t = T-2, \dots, 2$. At stage $t = 1$, solve
$$LB_k \leftarrow \min_{\substack{A_1 x_1 = b_1 \\ x_1 \ge 0}} c_1^T x_1 + \underline{\mathcal{Q}}_2(x_1).$$
$LB_k$ is a lower bound on the SAA optimal value and, on average, a lower bound on the true optimal value $\vartheta^*$.

19 SDDP method: Illustration [Figure: True problem; panels (a) Stage 1, (b) Stage 2, (c) Stage 3]

20 SDDP method: Illustration [Figure: SAA problem; panels (a)-(c): Stages 1-3]

21-24 SDDP method: Illustration [Figures: SDDP iteration 1, Forward step; panels (a)-(c): Stages 1-3]

25-27 SDDP method: Illustration [Figures: SDDP iteration 1, Backward step; panels (a)-(c): Stages 1-3]

28-30 SDDP method: Illustration [Figures: SDDP iteration 2, Forward step; panels (a)-(c): Stages 1-3]

31-33 SDDP method: Illustration [Figures: SDDP iteration 2, Backward step; panels (a)-(c): Stages 1-3] ...

34 Summary: KEY IDEAS for the SDDP algorithm
Forward step: a sampled process $\xi_1,\dots,\xi_T$ yields an implementable policy $x_1,\dots,x_T$; repeating over many sampled paths yields an upper bound on the true optimal value.
Backward step: the cost-to-go functions
$$\mathcal{Q}_t(x_{t-1}) = \frac{1}{N_t}\sum_{j=1}^{N_t} \min_{\substack{\tilde B_{t,j} x_{t-1} + \tilde A_{t,j} x_t = \tilde b_{t,j} \\ x_t \ge 0}} \tilde c_{t,j}^T x_t + \mathcal{Q}_{t+1}(x_t)$$
are convex piecewise-linear functions of $x_{t-1}$. Refine the lower approximations $\underline{\mathcal{Q}}_t(\cdot)$ using the subgradient
$$\partial \mathcal{Q}_t(\cdot)(x_{t-1}) = \frac{1}{N_t}\sum_{j=1}^{N_t} \big({-\tilde B_{t,j}^T}\big)\{\pi_{t,j} : \text{opt. sol. of the dual}\ \dots\},$$
which yields a lower bound on the true optimal value.

35 Convergence and main contributions
Proposition (Convergence). Assume:
i. at the forward step, the process subsamples are drawn independently of each other;
ii. at all iterations, the approximated problems
$$\min_{\substack{A_1 x_1 = b_1 \\ x_1 \ge 0}} c_1^T x_1 + \underline{\mathcal{Q}}_2(x_1) \qquad \text{and} \qquad \min_{\substack{\tilde B_{t,j} x_{t-1} + \tilde A_{t,j} x_t = \tilde b_{t,j} \\ x_t \ge 0}} \tilde c_{t,j}^T x_t + \underline{\mathcal{Q}}_{t+1}(x_t)$$
have finite optimal value;
iii. in the backward step, basic optimal solutions are used.
Then, after a sufficiently large but finite number of iterations of the SDDP algorithm, the forward procedure defines an optimal policy for the SAA problem.

36 Convergence and main contributions
Tractability? The issue: the total number of scenarios is $\prod_{t=2}^T N_t$.
- The cutting-plane method already scales badly for two-stage problems; is SDDP, as a generalization of the cutting-plane method, even worse?
- One run of the Backward step: solve $1 + N_2 + \cdots + N_T$ LPs.
- One run of the Forward step: solve $1 + M(T-1)$ LPs.
- "Tractability": per iteration, the SDDP method constructs a feasible policy at a cost that is additive, not multiplicative, in the stage sample sizes $N_t$.
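For a rough sense of scale (an illustrative calculation, not from the slides): with $T = 10$ stages and $N_t = 100$ realizations per stage, the scenario tree has $100^9 = 10^{18}$ scenarios, yet one SDDP iteration solves only $1 + 9 \cdot 100 = 901$ backward-step LPs and, with $M = 20$ forward paths, $1 + 20 \cdot 9 = 181$ forward-step LPs.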

37 Convergence and main contributions THE END
