Coordinate Minimization
Daniel P. Robinson
Department of Applied Mathematics and Statistics, Johns Hopkins University
November 27, 2018

Outline
1 Introduction
2 Algorithms
  - Cyclic order with exact minimization
  - Random order with fixed step size
  - Cyclic order with fixed step size
  - Steepest direction (Gauss-Southwell rule) with fixed step size
  - Alternatives
  - Summary
3 Examples
  - Linear equations
  - Logistic regression

Given a function f : R^n -> R, consider the unconstrained optimization problem

  minimize_x f(x).

Basic idea of coordinate minimization: compute the next iterate using the update

  x_{k+1} = x_k - α_k e_{i_k}.

We will consider various assumptions on f:
  - nonconvex and differentiable f
  - convex and differentiable f
  - strongly convex and differentiable f
We will not consider general non-smooth f, because no convergence guarantees can be proved in that generality. We will briefly consider structured non-smooth problems, i.e., problems that use an additional separable regularizer.

Notation: f_k := f(x_k) and g_k := ∇f(x_k).

Algorithm 1 General coordinate minimization framework.
1: Choose x_0 ∈ R^n and set k <- 0.
2: loop
3:   Choose i_k ∈ {1, 2, ..., n}.
4:   Choose α_k > 0.
5:   Set x_{k+1} <- x_k - α_k e_{i_k}.
6:   Set k <- k + 1.
7: end loop

α_k is the step size. Options include:
  - fixed, but sufficiently small
  - inexact linesearch
  - exact linesearch
i_k ∈ {1, 2, ..., n} must be chosen. Options include:
  - cycle through the entire set
  - choose it randomly without replacement
  - choose it randomly with replacement
  - choose it based on which element of ∇f(x_k) is the largest in absolute value
e_{i_k} is the i_k-th coordinate vector; this update seeks better points in span{e_{i_k}}.
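As a concrete illustration, the framework above can be sketched as a short program with pluggable index and step-size rules. This is a minimal Python sketch, not code from the slides; the separable quadratic objective and all helper names are illustrative assumptions (here the rule shown is random selection with a fixed step):

```python
import random

def coordinate_minimization(x0, pick_index, step, iters=2000):
    """Algorithm 1 skeleton: choose i_k, choose a (signed) step alpha_k,
    then move along the coordinate direction via x_{k+1} = x_k - alpha_k e_{i_k}."""
    x = list(x0)
    n = len(x)
    for k in range(iters):
        i = pick_index(k, n)        # rule for i_k in {0, ..., n-1}
        alpha = step(x, i)          # signed step alpha_k along e_i
        x[i] -= alpha
    return x

# Illustrative objective: f(x) = 0.5 * sum_j lam_j * x_j^2, so grad_j f(x) = lam_j * x_j.
lam = [1.0, 4.0, 2.0]
L_max = max(lam)                    # coordinate Lipschitz constant for this f

random.seed(0)
x = coordinate_minimization(
    [3.0, -2.0, 5.0],
    pick_index=lambda k, n: random.randrange(n),   # random with replacement
    step=lambda x, i: lam[i] * x[i] / L_max,       # fixed step: (1/L_max) * grad_i f(x)
)
```

Swapping `pick_index` (cyclic, random, greedy) and `step` (fixed, linesearch) recovers the variants studied in the remainder of the notes.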
The following example shows that coordinate minimization may fail if f is non-smooth:

  minimize_{x_1, x_2}  f(x_1, x_2) := x_1 + x_2 + (min{0, x_1})^2 + (min{0, x_2})^2 + 3 |x_1 - x_2|

Figure 1: Level curves for the function f defined above. Coordinate minimization cannot make progress from any point satisfying x_1 = x_2 > 0.

Note: coordinate descent does work if the non-smoothness is structured (block separable).

Algorithm 2 Coordinate minimization with cyclic order and exact minimization.
1: Choose x_0 ∈ R^n and set k <- 0.
2: loop
3:   Choose i_k = mod(k, n) + 1.
4:   Calculate the exact coordinate minimizer:
       α_k ∈ argmin_{α ∈ R} f(x_k - α e_{i_k}).
5:   Set x_{k+1} <- x_k - α_k e_{i_k}.
6:   Set k <- k + 1.
7: end loop

Comments:
- This algorithm assumes that the exact minimizers exist and that they are unique.
- A reasonable stopping condition should be incorporated, such as
    ||∇f(x_k)||_2 <= 10^{-6} max{1, ||∇f(x_0)||_2}.

An interesting example introduced by Powell [5, formula (2)] is

  minimize_{x_1, x_2, x_3}  f(x_1, x_2, x_3) := -(x_1 x_2 + x_2 x_3 + x_1 x_3) + Σ_{i=1}^{3} (max{|x_i| - 1, 0})^2

- f is continuously differentiable and nonconvex.
- f has minimizers at (1, 1, 1) and (-1, -1, -1), vertices of the unit cube.
- Coordinate descent with exact minimization started just outside the unit cube near any non-optimal vertex cycles around neighborhoods of all 6 non-optimal vertices.
- Powell shows that the cyclic nonconvergence behavior is special and is destroyed by small perturbations on this particular example.

Figure 2: The three-dimensional example given above. It shows the possible lack of convergence of a coordinate descent method with exact minimization.

Theorem 2.1 (see [6, Theorem 5.32])
Assume that the following hold:
- f is continuously differentiable;
- the level set L_0 := {x ∈ R^n : f(x) <= f(x_0)} is bounded; and
- for every x ∈ L_0 and all j ∈ {1, 2, ..., n}, the optimization problem
    minimize_{ζ ∈ R} f(x + ζ e_j)
  has a unique minimizer.
Then every limit point x̄ of the sequence {x_k} generated by Algorithm 2 satisfies ∇f(x̄) = 0.
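For a strictly convex quadratic, the exact coordinate minimizer in step 4 of Algorithm 2 is available in closed form, and the method reduces to the classical Gauss-Seidel iteration. A minimal Python sketch (the quadratic f(x) = (1/2) x^T H x - b^T x and its data are illustrative assumptions, not from the slides):

```python
def cyclic_exact_quadratic(H, b, x0, sweeps=100):
    """Algorithm 2 for f(x) = 0.5 x^T H x - b^T x with H symmetric positive
    definite.  Exactly minimizing f over coordinate i (others held fixed)
    gives the closed-form update x_i = (b_i - sum_{j != i} H[i][j] x_j) / H[i][i],
    which is one Gauss-Seidel step."""
    n = len(b)
    x = list(x0)
    for _ in range(sweeps):
        for i in range(n):          # cyclic order: i_k = mod(k, n) + 1 in 1-based terms
            s = sum(H[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / H[i][i]
    return x

H = [[4.0, 1.0], [1.0, 3.0]]        # symmetric positive definite
b = [1.0, 2.0]
x = cyclic_exact_quadratic(H, b, [0.0, 0.0])   # converges to H^{-1} b = (1/11, 7/11)
```

Because H is positive definite, each coordinate subproblem has a unique minimizer, so the assumptions of Theorem 2.1 hold for this f.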
This example and others in [5] show that we cannot expect a general convergence result for nonconvex functions similar to that for full-gradient descent. (The picture was taken from [7].)

Proof: Since f(x_{k+1}) <= f(x_k) for all k ∈ N, we know that the sequence {x_k}_{k=0}^∞ ⊂ L_0. Since L_0 is bounded, {x_k}_{k=0}^∞ has at least one limit point; let x̄ be any such limit point. Thus, there exists a subsequence K ⊆ N satisfying

  lim_{k ∈ K} x_k = x̄.     (2)

Combining this with monotonicity of {f(x_k)} and continuity of f also shows that

  lim_{k -> ∞} f(x_k) = f(x̄)  and  f(x_k) >= f(x̄) for all k.     (3)
We assume ∇f(x̄) ≠ 0, and then reach a contradiction. First, consider the subsets K_i ⊆ K, i = 0, ..., n-1, defined as

  K_i := {k ∈ K : k ≡ i mod n}.

Since K is an infinite subsequence of the natural numbers, one of the K_i must be an infinite set. Without loss of generality, we assume it is K_0 (the argument is very similar for any other i, because we are using cyclic order). Let us perform a hypothetical "sweep" of coordinate minimization starting from x̄, so that we would obtain y := x̄_n, with

  x̄_l := x̄ + Σ_{j=1}^{l} [τ̄]_j e_j  for all l = 1, ..., n,

where [τ̄]_j denotes the exact minimizing step along e_j from x̄_{j-1}. Note that since ∇f(x̄) ≠ 0 by assumption, we must have

  f(y) < f(x̄).  (why?)     (4)

NOTE: If K_i were infinite for some i ≠ 0, then we would do the above "sweep" at x̄ starting with coordinate i + 1 and going in cyclic order to cover all n coordinates.

Next, notice by construction of the coordinate minimization scheme that

  x_{k+l} = x_k + Σ_{j=1}^{l} τ_{k+j-1} e_j  for all k ∈ K_0 and 1 <= l <= n,     (5)

where τ_{k+j-1} denotes the signed step taken at iteration k + j - 1, meaning that

  ||x_{k+l} - x_k||_2 = ||(τ_k, τ_{k+1}, ..., τ_{k+l-1})||_2 <= 2 max{||x||_2 : x ∈ L_0} < ∞  for all k ∈ K_0 and 1 <= l <= n,

where we used the assumption that L_0 is bounded. Since this shows that the set {(τ_k, τ_{k+1}, ..., τ_{k+n-1})^T}_{k ∈ K_0} is bounded, we may pass to a subsequence K̄ ⊆ K_0 with

  lim_{k ∈ K̄} (τ_k, τ_{k+1}, ..., τ_{k+n-1})^T = τ^L  for some τ^L ∈ R^n.     (6)

Taking the limit of (5) over k ∈ K̄ for each l, and using (2) and (6), we find that

  lim_{k ∈ K̄} x_{k+l} = x̄ + Σ_{j=1}^{l} [τ^L]_j e_j  for each 1 <= l <= n.     (7)

We next claim the following, which we will prove by induction:

  [τ^L]_p = [τ̄]_p  for all 1 <= p <= n, and     (8)
  lim_{k ∈ K̄} x_{k+p} = x̄_p  for all 1 <= p <= n.     (9)

Base case (p = 1): We know from the coordinate minimization that f(x_{k+1}) <= f(x_k + τ e_1) for all k ∈ K̄ and τ ∈ R. Taking limits over k ∈ K̄ and using continuity of f, (7) with l = 1, and (2) yields

  f(x̄ + [τ^L]_1 e_1) = f(lim_{k ∈ K̄} x_{k+1}) = lim_{k ∈ K̄} f(x_{k+1})
                     <= lim_{k ∈ K̄} f(x_k + τ e_1) = f((lim_{k ∈ K̄} x_k) + τ e_1) = f(x̄ + τ e_1)  for all τ ∈ R.

Since the minimizations in coordinate directions are unique by assumption, we know that [τ^L]_1 = [τ̄]_1, which is the first desired result. Also, combining it with (7) gives

  lim_{k ∈ K̄} x_{k+1} = x̄ + [τ^L]_1 e_1 = x̄ + [τ̄]_1 e_1 ≡ x̄_1,

which completes the base case.
Induction step: Assume that (8) and (9) hold for all indices up to some p < n. We know from the coordinate minimization that f(x_{k+p+1}) <= f(x_{k+p} + τ e_{p+1}) for all k ∈ K̄ and τ ∈ R. Taking the limit over k ∈ K̄, and using continuity of f, (7) with l = p + 1, and (9), gives

  f(x̄ + Σ_{j=1}^{p+1} [τ^L]_j e_j) = f(lim_{k ∈ K̄} x_{k+p+1}) = lim_{k ∈ K̄} f(x_{k+p+1})
                                  <= lim_{k ∈ K̄} f(x_{k+p} + τ e_{p+1}) = f((lim_{k ∈ K̄} x_{k+p}) + τ e_{p+1}) = f(x̄_p + τ e_{p+1})  for all τ ∈ R.

Thus, the definition of x̄_p and the fact that (8) holds for all indices up to p show that

  f(x̄_p + [τ^L]_{p+1} e_{p+1}) = f(x̄ + Σ_{j=1}^{p} [τ̄]_j e_j + [τ^L]_{p+1} e_{p+1})
                               = f(x̄ + Σ_{j=1}^{p} [τ^L]_j e_j + [τ^L]_{p+1} e_{p+1})
                               = f(x̄ + Σ_{j=1}^{p+1} [τ^L]_j e_j)
                              <= f(x̄_p + τ e_{p+1})  for all τ ∈ R.

Since the minimization in coordinate directions is unique by assumption, we know that [τ^L]_{p+1} = [τ̄]_{p+1}, which is the first desired result. Also, combining it with (7) gives

  lim_{k ∈ K̄} x_{k+p+1} = x̄ + Σ_{j=1}^{p+1} [τ^L]_j e_j = x̄ + Σ_{j=1}^{p+1} [τ̄]_j e_j ≡ x̄_{p+1},

which completes the proof by induction.
From our induction proof, we have that τ̄ = τ^L. Combining this with (7) and the definition of y gives

  lim_{k ∈ K̄} x_{k+n} = x̄ + Σ_{j=1}^{n} [τ^L]_j e_j = x̄ + Σ_{j=1}^{n} [τ̄]_j e_j ≡ x̄_n ≡ y.     (10)

Finally, combining (3), continuity of f, (10), and (4) shows that

  f(x̄) = lim_{k ∈ K̄} f(x_{k+n}) = f(lim_{k ∈ K̄} x_{k+n}) = f(y) < f(x̄),

which is a contradiction. This completes the proof.

Notation:
- Let L_j denote the j-th component Lipschitz constant, i.e., it satisfies
    |∇_j f(x + t e_j) - ∇_j f(x)| <= L_j |t|  for all x ∈ R^n and t ∈ R.
- Let L_max denote the coordinate Lipschitz constant, i.e., L_max := max_i L_i.

Algorithm 3 Coordinate minimization with random order and a fixed step size.
1: Choose α ∈ (0, 1/L_max].
2: Choose x_0 ∈ R^n and set k <- 0.
3: loop
4:   Choose i_k ∈ {1, 2, ..., n} randomly with equal probability.
5:   Set x_{k+1} <- x_k - α ∇_{i_k} f(x_k) e_{i_k}.
6:   Set k <- k + 1.
7: end loop

Comments:
- A reasonable stopping condition should be incorporated, such as ||∇f(x_k)||_2 <= 10^{-6} max{1, ||∇f(x_0)||_2}.
- A maximum number of iterations should be included in practice.

Theorem 2.2
Suppose that α = 1/L_max and let the following assumptions hold:
- f is convex;
- ∇f is globally Lipschitz continuous;
- the minimum value of f is attained on some set S, i.e., there exists S ⊆ R^n with x* ∈ S and f* := f(x*) = min_x f(x);
- there exists a scalar R_0 satisfying
    max_{x* ∈ S} max_x { ||x - x*||_2 : f(x) <= f(x_0) } <= R_0.
Then, the iterate sequence {x_k} generated by Algorithm 3 satisfies

  E[f(x_k)] - f* <= 2 n L_max R_0^2 / k.

Moreover, if f is strongly convex with parameter σ > 0, i.e.,

  f(y) >= f(x) + ∇f(x)^T (y - x) + (σ/2) ||y - x||_2^2  for all {x, y} ⊂ R^n,

then

  E[f(x_k)] - f* <= (1 - σ/(n L_max))^k (f(x_0) - f*).

Proof (follows Wright [7]): It follows from Taylor's Theorem, the definitions of L_j and L_max, and α = 1/L_max that

  f(x_{k+1}) = f(x_k - α ∇_{i_k} f(x_k) e_{i_k})
            <= f(x_k) - α ∇_{i_k} f(x_k) e_{i_k}^T ∇f(x_k) + (1/2) α^2 L_{i_k} (∇_{i_k} f(x_k))^2
             = f(x_k) - α (∇_{i_k} f(x_k))^2 + (1/2) α^2 L_{i_k} (∇_{i_k} f(x_k))^2
            <= f(x_k) - α (1 - (1/2) α L_max) (∇_{i_k} f(x_k))^2    (Why? Exercise.)
             = f(x_k) - (1/(2 L_max)) (∇_{i_k} f(x_k))^2.     (12)

If we now take the expectation of both sides with respect to i_k, we find that

  E_{i_k}[f(x_{k+1})] <= f(x_k) - (1/(2 L_max)) (1/n) Σ_{j=1}^{n} (∇_j f(x_k))^2 = f(x_k) - (1/(2 n L_max)) ||∇f(x_k)||_2^2.

Subtracting f* from both sides shows that

  E_{i_k}[f(x_{k+1})] - f* <= f(x_k) - f* - (1/(2 n L_max)) ||∇f(x_k)||_2^2.     (13)
From the previous slide, we have

  E_{i_k}[f(x_{k+1})] - f* <= f(x_k) - f* - (1/(2 n L_max)) ||∇f(x_k)||_2^2.

Taking expectation with respect to all the random variables {i_0, i_1, i_2, ...}, and defining

  φ_k := E[f(x_k)] - f*,

we find that

  φ_{k+1} = E[f(x_{k+1})] - f*
          = E_{i_0, i_1, ..., i_{k-1}}[ E_{i_k}[f(x_{k+1}) | x_k] ] - f*
         <= E_{i_0, i_1, ..., i_{k-1}}[ f(x_k) - f* - (1/(2 n L_max)) ||∇f(x_k)||_2^2 ]
          = φ_k - (1/(2 n L_max)) E[ ||∇f(x_k)||_2^2 ]
         <= φ_k - (1/(2 n L_max)) ( E[ ||∇f(x_k)||_2 ] )^2,     (14)

where we used Jensen's Inequality to derive the last inequality.

Next, note from convexity of f, the definition of R_0, and the fact that f(x_k) <= f(x_0) for all k by construction of the algorithm, that

  f(x_k) - f* <= ∇f(x_k)^T (x_k - x*) <= ||∇f(x_k)||_2 ||x_k - x*||_2 <= R_0 ||∇f(x_k)||_2.     (15)

Taking expectation of both sides shows that

  E[ ||∇f(x_k)||_2 ] >= (1/R_0) E[ f(x_k) - f* ] = φ_k / R_0.

Combining this bound with (14) yields

  φ_{k+1} <= φ_k - φ_k^2 / (2 n L_max R_0^2).

Combining this with φ_{k+1} <= φ_k (see (14)), we have

  1/(2 n L_max R_0^2) <= (φ_k - φ_{k+1}) / φ_k^2 <= (φ_k - φ_{k+1}) / (φ_k φ_{k+1}) = 1/φ_{k+1} - 1/φ_k.     (16)

Summing both sides for k = 0, 1, ..., l - 1 shows that

  l / (2 n L_max R_0^2) <= Σ_{k=0}^{l-1} ( 1/φ_{k+1} - 1/φ_k ) = 1/φ_l - 1/φ_0 <= 1/φ_l.

Rearranging, replacing l by k, and using the definition of φ_k, this is equivalent to

  E[f(x_k)] - f* = φ_k <= 2 n L_max R_0^2 / k,

which is the first desired result.

For the second part, assume that f is strongly convex with parameter σ, i.e., that

  f(y) >= f(x) + ∇f(x)^T (y - x) + (σ/2) ||y - x||_2^2  for all {x, y} ⊂ R^n.

By choosing x = x_k and minimizing both sides with respect to y, we find that

  f* = min_{y ∈ R^n} f(y) >= min_{y ∈ R^n} [ f(x_k) + ∇f(x_k)^T (y - x_k) + (σ/2) ||y - x_k||_2^2 ]
     = f(x_k) + ∇f(x_k)^T (y_k - x_k) + (σ/2) ||y_k - x_k||_2^2
     = f(x_k) - (1/σ) ||∇f(x_k)||_2^2 + (1/(2σ)) ||∇f(x_k)||_2^2
     = f(x_k) - (1/(2σ)) ||∇f(x_k)||_2^2,     (17)

where y_k := x_k - (1/σ) ∇f(x_k).
On the previous slide we proved that

  f* >= f(x_k) - (1/(2σ)) ||∇f(x_k)||_2^2,  i.e.,  ||∇f(x_k)||_2^2 >= 2σ (f(x_k) - f*).

Combining this bound with the intermediate inequality in (14) gives

  φ_{k+1} <= φ_k - (1/(2 n L_max)) E[ ||∇f(x_k)||_2^2 ]
         <= φ_k - (1/(2 n L_max)) E[ 2σ (f(x_k) - f*) ]
          = φ_k - (σ/(n L_max)) E[ f(x_k) - f* ]
          = φ_k - (σ/(n L_max)) φ_k
          = (1 - σ/(n L_max)) φ_k.

Applying this recursively shows that

  φ_k <= (1 - σ/(n L_max))^k φ_0,

so that, after we use the definition of φ_k, we have

  E[f(x_k)] - f* <= (1 - σ/(n L_max))^k (f(x_0) - f*),

which is the second desired result.
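The per-step decrease guarantee behind this proof is easy to observe numerically. A minimal Python sketch of Algorithm 3 (the separable quadratic, for which L_j = λ_j and L_max = max_j λ_j, is an illustrative assumption, not data from the slides); it records the objective along the iterations so the monotone decrease can be checked:

```python
import random

def random_fixed_step(lam, x0, iters=500, seed=0):
    """Algorithm 3 on f(x) = 0.5 * sum_j lam_j * x_j^2 (so L_j = lam_j and
    L_max = max(lam)): pick i_k uniformly at random and take the fixed step
    x_{k+1} = x_k - (1/L_max) * grad_{i_k} f(x_k) * e_{i_k}.
    Returns the final iterate and the sequence of objective values."""
    rng = random.Random(seed)
    L_max = max(lam)
    x = list(x0)
    f = lambda x: 0.5 * sum(l * v * v for l, v in zip(lam, x))
    values = [f(x)]
    for _ in range(iters):
        i = rng.randrange(len(x))
        x[i] -= lam[i] * x[i] / L_max      # grad_i f(x) = lam_i * x_i
        values.append(f(x))
    return x, values

x, values = random_fixed_step([1.0, 3.0, 10.0], [5.0, -4.0, 2.0])
```

Because α = 1/L_max, every single step satisfies the decrease bound (12), so `values` is nonincreasing even though the coordinate order is random.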
Notation:
- Let L_j denote the j-th component Lipschitz constant, i.e., it satisfies
    |∇_j f(x + t e_j) - ∇_j f(x)| <= L_j |t|  for all x ∈ R^n and t ∈ R.
- Let L_max denote the coordinate Lipschitz constant, i.e., L_max := max_i L_i.
- Let L denote the Lipschitz constant for ∇f.

Algorithm 4 Coordinate minimization with cyclic order and a fixed step size.
1: Choose α ∈ (0, 1/L_max].
2: Choose x_0 ∈ R^n and set k <- 0.
3: loop
4:   Choose i_k = mod(k, n) + 1.
5:   Set x_{k+1} <- x_k - α ∇_{i_k} f(x_k) e_{i_k}.
6:   Set k <- k + 1.
7: end loop

Comments:
- A reasonable stopping condition should be incorporated, such as ||∇f(x_k)||_2 <= 10^{-6} max{1, ||∇f(x_0)||_2}.
- A maximum number of allowed iterations should be included in practice.

Theorem 2.3 (see [1, Theorems 3.6 and 3.9] and [7, Theorem 3])
Suppose that α = 1/L_max and let the following assumptions hold:
- f is convex;
- ∇f is globally Lipschitz continuous;
- the minimum value of f is attained on some set S, i.e., there exists S ⊆ R^n with x* ∈ S and f* := f(x*) = min_x f(x);
- there exists a scalar R_0 satisfying
    max_{x* ∈ S} max_x { ||x - x*||_2 : f(x) <= f(x_0) } <= R_0.
If {x_k} is the iterate sequence of Algorithm 4, then for k ∈ {n, 2n, 3n, ...} we have

  f(x_k) - f* <= 4 n L_max (1 + n L^2 / L_max^2) R_0^2 / (k + 8).     (18)

If f is strongly convex with parameter σ > 0 (see Theorem 2.2), then for k ∈ {n, 2n, 3n, ...}

  f(x_k) - f* <= (1 - σ / (2 L_max (1 + n L^2 / L_max^2)))^{k/n} (f(x_0) - f*).     (19)

Proof: See [1, Theorems 3.6 and 3.9] and use: (i) each "iteration k" in [1] is a cycle of n iterations; (ii) choose in [1] the values L_i = L_max for all i; (iii) in [1] we have p = 1 since our blocks of variables are singletons, i.e., coordinate descent.

Comments on Theorem 2.3:
- The numerator in (18) is O(n^2), while the numerator in the analogous result (Theorem 2.2) for the random coordinate choice with a fixed step size is O(n).
- But Theorem 2.3 is a deterministic result, while Theorem 2.2 is a result in expectation.
- As part of the homework assignment, you will find out for yourself how these methods perform on a simple quadratic objective function.
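The homework-style comparison of cyclic and random selection can be prototyped in a few lines. A minimal Python sketch (the separable quadratic test problem and tolerances are illustrative assumptions) that counts how many coordinate steps each rule needs to reach a target accuracy:

```python
import random

def steps_to_tol(pick, lam, x0, tol=1e-8, max_iters=10**6):
    """Fixed-step coordinate descent (alpha = 1/L_max) on the separable
    quadratic f(x) = 0.5 * sum_j lam_j * x_j^2; returns the number of
    coordinate steps taken before f(x_k) <= tol."""
    L_max = max(lam)
    x = list(x0)
    for k in range(max_iters):
        if 0.5 * sum(l * v * v for l, v in zip(lam, x)) <= tol:
            return k
        i = pick(k, len(x))
        x[i] -= lam[i] * x[i] / L_max      # grad_i f(x) = lam_i * x_i
    return max_iters

lam, x0 = [1.0, 2.0, 8.0], [1.0, 1.0, 1.0]
cyclic = steps_to_tol(lambda k, n: k % n, lam, x0)          # Algorithm 4
rng = random.Random(1)
rand = steps_to_tol(lambda k, n: rng.randrange(n), lam, x0)  # Algorithm 3
```

On a single run the random rule gives only one sample of its (expected) behavior, so a fair comparison would average over many seeds; the worst-case bounds above need not predict which rule wins on any particular instance.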
- It can be shown that L <= Σ_{j=1}^{n} L_j (see [3, Lemma 2 with α = 1]).
- It follows from the fact that

    |∇_j f(x + t e_j) - ∇_j f(x)| <= ||∇f(x + t e_j) - ∇f(x)||_2 <= L |t|

  holds for all j, t, and x that L_j <= L.
- By combining the previous two bullet points, we find that

    L_max = max_j L_j <= L <= Σ_{j=1}^{n} L_j <= n L_max,  so that  1 <= L / L_max <= n.

- Roughly speaking, L / L_max is closer to 1 when the coordinates are more "decoupled". In light of (18), the complexity result for coordinate descent becomes better as the variables become more decoupled. This makes sense!

Notation:
- Let L_j denote the j-th component Lipschitz constant, i.e., it satisfies
    |∇_j f(x + t e_j) - ∇_j f(x)| <= L_j |t|  for all x ∈ R^n and t ∈ R.
- Let L_max denote the coordinate Lipschitz constant, i.e., L_max := max_i L_i.

Algorithm 5 Coordinate minimization with the Gauss-Southwell rule and a fixed step size.
1: Choose α ∈ (0, 1/L_max].
2: Choose x_0 ∈ R^n and set k <- 0.
3: loop
4:   Calculate i_k as the steepest coordinate direction, i.e.,
       i_k ∈ argmax_i |∇_i f(x_k)|.
5:   Set x_{k+1} <- x_k - α ∇_{i_k} f(x_k) e_{i_k}.
6:   Set k <- k + 1.
7: end loop

Comments:
- A reasonable stopping condition should be incorporated, such as ||∇f(x_k)||_2 <= 10^{-6} max{1, ||∇f(x_0)||_2}.
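Algorithm 5 can be sketched directly from its description; note that, unlike the random and cyclic rules, each iteration needs the full gradient to evaluate the argmax. A minimal Python sketch (the separable quadratic and all names are illustrative assumptions):

```python
def gauss_southwell(grad, L_max, x0, iters=300):
    """Algorithm 5: pick the steepest coordinate i_k = argmax_i |grad_i f(x_k)|
    and take the fixed step alpha = 1/L_max along e_{i_k}."""
    x = list(x0)
    n = len(x)
    for _ in range(iters):
        g = grad(x)
        i = max(range(n), key=lambda j: abs(g[j]))   # Gauss-Southwell rule
        x[i] -= g[i] / L_max
    return x

# Illustrative separable quadratic: grad_j f(x) = lam_j * x_j, so L_max = max(lam).
lam = [1.0, 5.0, 2.0]
grad = lambda x: [l * v for l, v in zip(lam, x)]
x = gauss_southwell(grad, max(lam), [4.0, -1.0, 3.0])
```

The extra cost of the rule is exactly this full-gradient evaluation per step, which is why the per-iteration comparison with random selection (discussed next) is subtle.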
Theorem 2.4
Suppose that α = 1/L_max and let the following assumptions hold:
- f is convex;
- ∇f is globally Lipschitz continuous;
- the minimum value of f is attained on some set S, i.e., there exists S ⊆ R^n with x* ∈ S and f* := f(x*) = min_x f(x);
- there exists a scalar R_0 satisfying
    max_{x* ∈ S} max_x { ||x - x*||_2 : f(x) <= f(x_0) } <= R_0.
Then, the iterate sequence {x_k} computed from Algorithm 5 satisfies

  f(x_k) - f* <= 2 n L_max R_0^2 / k.

If f is strongly convex with parameter σ > 0 (see Theorem 2.2), then

  f(x_k) - f* <= (1 - σ/(n L_max))^k (f(x_0) - f*).

Proof: From earlier (see (12)), we showed that

  f(x_{k+1}) <= f(x_k) - (1/(2 L_max)) (∇_{i_k} f(x_k))^2.

Combining this with the choice i_k ∈ argmax_i |∇_i f(x_k)| and the standard norm inequality ||v||_2 <= sqrt(n) ||v||_∞, it holds that

  f(x_{k+1}) <= f(x_k) - (1/(2 L_max)) ||∇f(x_k)||_∞^2 <= f(x_k) - (1/(2 n L_max)) ||∇f(x_k)||_2^2.     (20)

Subtracting f* from both sides and using the previous fact (see (15)) that f(x_k) - f* <= R_0 ||∇f(x_k)||_2, we find that

  f(x_{k+1}) - f* <= f(x_k) - f* - (1/(2 n L_max)) ||∇f(x_k)||_2^2 <= f(x_k) - f* - (1/(2 n L_max R_0^2)) (f(x_k) - f*)^2.

Using the notation φ_k := f(x_k) - f*, this is equivalent to

  φ_{k+1} <= φ_k - φ_k^2 / (2 n L_max R_0^2),

which is exactly the same as inequality (16) except that we now have a different (deterministic) definition of φ_k. Then, as shown in that proof, we have

  f(x_k) - f* = φ_k <= 2 n L_max R_0^2 / k,

which is the desired result for convex f.

Next, assume that f is strongly convex, for which we showed earlier (see (17)) that

  f* >= f(x_k) - (1/(2σ)) ||∇f(x_k)||_2^2.

Subtracting f* from each side of (20) and then using the previous inequality shows that

  f(x_{k+1}) - f* <= f(x_k) - f* - (1/(2 n L_max)) ||∇f(x_k)||_2^2
                 <= f(x_k) - f* - (σ/(n L_max)) (f(x_k) - f*)
                  = (1 - σ/(n L_max)) (f(x_k) - f*),

so that

  f(x_k) - f* <= (1 - σ/(n L_max))^k (f(x_0) - f*),

which is the last desired result.

Comments so far for a fixed step size:
- Cyclic has the worst dependence on n:
  - Cyclic: O(n^2)
  - Random and Gauss-Southwell: O(n)
- Random is a rate in expectation; Gauss-Southwell is a deterministic rate.
- There is a better analysis for Gauss-Southwell, when we assume that f is strongly convex, that changes the above comment! See [4]. We show this next.
Theorem 2.5
Suppose that α = 1/L_max and let the following assumptions hold:
- f is ℓ_1-strongly convex, i.e., there exists σ_1 > 0 such that
    f(y) >= f(x) + ∇f(x)^T (y - x) + (σ_1/2) ||y - x||_1^2  for all {x, y} ⊂ R^n;
- ∇f is globally Lipschitz continuous;
- the minimum value of f is attained.
Then, the iterate sequence {x_k} computed from Algorithm 5 satisfies

  f(x_k) - f* <= (1 - σ_1/L_max)^k (f(x_0) - f*).

Proof (see [4]): Using ℓ_1-strong convexity with x replaced by x_k, and minimizing both sides with respect to y, we find that

  f* = min_{y ∈ R^n} f(y) >= min_{y ∈ R^n} [ f(x_k) + ∇f(x_k)^T (y - x_k) + (σ_1/2) ||y - x_k||_1^2 ]
     = f(x_k) + ∇f(x_k)^T (y_k - x_k) + (σ_1/2) ||y_k - x_k||_1^2    (why? exercise)
     = f(x_k) - (1/(2 σ_1)) ||∇f(x_k)||_∞^2,

where y_k := x_k + z_k with

  [z_k]_i := { -∇_l f(x_k)/σ_1  if i = l;  0  if i ≠ l }

and l is any index satisfying l ∈ { j : |∇_j f(x_k)| = ||∇f(x_k)||_∞ }. Therefore, we have that

  ||∇f(x_k)||_∞^2 >= 2 σ_1 (f(x_k) - f*).

Subtracting f* from both sides of the decrease bound f(x_{k+1}) <= f(x_k) - (1/(2 L_max)) ||∇f(x_k)||_∞^2 (the first inequality in (20)) and using the previous inequality shows that

  f(x_{k+1}) - f* <= f(x_k) - f* - (1/(2 L_max)) ||∇f(x_k)||_∞^2
                 <= f(x_k) - f* - (σ_1/L_max) (f(x_k) - f*)
                  = (1 - σ_1/L_max) (f(x_k) - f*).

Applying this inequality recursively gives

  f(x_k) - f* <= (1 - σ_1/L_max)^k (f(x_0) - f*),

which is the desired result.

For strongly convex functions:
- Random coordinate choice has the expected rate
    E[f(x_k)] - f* <= (1 - σ/(n L_max))^k (f(x_0) - f*).
- Gauss-Southwell coordinate choice has the deterministic rate
    f(x_k) - f* <= (1 - σ_1/L_max)^k (f(x_0) - f*).     (21)
- The bound for Gauss-Southwell is at least as good, since

    σ/n <= σ_1 <= σ,  so that  1 - σ_1/L_max <= 1 - σ/(n L_max).
Example: a simple diagonal quadratic function. Consider the problem

  minimize_x  g^T x + (1/2) x^T H x,

where H = diag(λ_1, λ_2, ..., λ_n) with λ_i > 0 for all i ∈ {1, 2, ..., n}. For this problem, we know that

  σ = min{λ_1, λ_2, ..., λ_n}  and  σ_1 = ( Σ_{i=1}^{n} 1/λ_i )^{-1}.

Case 1: For λ_i ≡ α for some α > 0 (the minimum value σ_1 = σ/n occurs when α = λ_1 = λ_2 = ... = λ_n), we get

  σ = α  and  σ_1 = α/n.

Thus, the convergence constants are:
  - random selection:          1 - σ/(n L_max) = 1 - 1/n
  - Gauss-Southwell selection: 1 - σ_1/L_max = 1 - 1/n,

so the convergence constants are the same; this is the worst case for Gauss-Southwell.

Case 2: For the other extreme case, let us suppose that λ_1 = β and λ_2 = λ_3 = ... = λ_n = α with α >= β. For this case, it can be shown that

  σ = β  and  σ_1 = β α^{n-1} / ( α^{n-1} + (n-1) β α^{n-2} ) = β α / ( α + (n-1) β ).

If we now take the limit as α -> ∞, we find that

  σ = β  and  σ_1 -> β = σ.

Thus, the convergence constants in the limit are:
  - random selection:          1 - σ/(n L_max)
  - Gauss-Southwell selection: 1 - σ/L_max,

so that, in terms of these bounds, Gauss-Southwell is a factor n faster than using a random coordinate selection.

Alternative 1 (strongly convex): individual coordinate Lipschitz constants. The iteration update is

  x_{k+1} = x_k - (1/L_{i_k}) ∇_{i_k} f(x_k) e_{i_k}.

Using a similar analysis as before, it can be shown that

  f(x_k) - f* <= [ Π_{j=0}^{k-1} (1 - σ_1/L_{i_j}) ] (f(x_0) - f*).

This is a better decrease than the prior analysis (see (21)) since

  new rate / previous rate = Π_{j=0}^{k-1} (1 - σ_1/L_{i_j}) / (1 - σ_1/L_max)^k <= 1,

i.e., it is faster provided at least one of the used L_{i_j} satisfies L_{i_j} < L_max.

Alternative 2 (strongly convex): Lipschitz sampling. Use a random coordinate direction chosen using a non-uniform probability distribution:

  P(i_k = j) = L_j / Σ_{l=1}^{n} L_l  for all j ∈ {1, 2, ..., n}.

Using an analysis similar to the previous one, but using the new probability distribution when computing the expectation, it can be shown that

  E[f(x_{k+1})] - f* <= (1 - σ/(n L̄)) ( E[f(x_k)] - f* ),

with L̄ being the average component Lipschitz constant, i.e., L̄ := (1/n) Σ_{i=1}^{n} L_i. The analysis was first performed in [2]. This rate is faster than uniform random sampling if not all of the component Lipschitz constants are the same.
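Lipschitz sampling only changes the sampling distribution and the per-coordinate step. A minimal Python sketch (the pairing with the Alternative-1 step 1/L_{i_k}, the separable quadratic, and all names are assumptions of this sketch, not verbatim from the slides); `random.choices` handles the non-uniform draw:

```python
import random

def lipschitz_sampling(grad_coord, L, x0, iters=1000, seed=0):
    """Alternative 2: draw coordinate j with probability L_j / sum(L) and
    use the per-coordinate step from Alternative 1:
    x_j <- x_j - grad_j f(x) / L_j."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        i = rng.choices(range(len(L)), weights=L)[0]   # P(i = j) = L_j / sum(L)
        x[i] -= grad_coord(x, i) / L[i]
    return x

# For a separable quadratic, L_j = lam_j, and the step 1/L_j solves each
# coordinate subproblem exactly, so every draw zeros one coordinate.
lam = [1.0, 4.0, 16.0]
grad_coord = lambda x, j: lam[j] * x[j]
x = lipschitz_sampling(grad_coord, lam, [2.0, 2.0, 2.0])
```

Note how the large-L_j coordinates are visited more often, which is exactly the bias that yields the (1 - σ/(n L̄)) constant instead of (1 - σ/(n L_max)).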
Alternative 3 (strongly convex): the Gauss-Southwell-Lipschitz rule. Choose i_k according to the rule

  i_k ∈ argmax_i (∇_i f(x_k))^2 / L_i.     (22)

Using an argument similar to that which led to (12), it may be shown that

  f(x_{k+1}) <= f(x_k) - (1/(2 L_{i_k})) (∇_{i_k} f(x_k))^2.     (23)

The rule (22) is designed to choose i_k to maximize the guaranteed decrease given by (23), which uses the component Lipschitz constants. It may be shown, using this update, that

  f(x_{k+1}) - f* <= (1 - σ_L) (f(x_k) - f*),

where σ_L is the strong convexity parameter with respect to the norm ||v||_L := Σ_{i=1}^{n} sqrt(L_i) |v_i|. It is shown in [4, Appendix 6.2] that

  max{ σ/(n L̄), σ_1/L_max } <= σ_L <= σ_1 / min_i{L_i}.

Ordering of the constants in the linear convergence results when f is strongly convex (later entries give better rates):
- random uniform sampling (using L_max) <= Gauss-Southwell (using L_max) <= Gauss-Southwell with {L_i};
- random Lipschitz sampling (using {L_i}) <= Gauss-Southwell-Lipschitz.

Comments:
- Gauss-Southwell-Lipschitz has the best rate, but is the most expensive per iteration.
- Better rates are obtained if you know and use {L_i} instead of just using their maximum L_max.
- Gauss-Southwell-Lipschitz is at least as fast as the fastest of the Gauss-Southwell and Lipschitz sampling options.

Linear Equations

Let m < n, b ∈ R^m, and A^T = [a_1 ... a_m] ∈ R^{n×m} with ||a_i||_2 = 1 for all i. Furthermore, suppose that A^T has full column rank, meaning that the linear system A w = b has infinitely many solutions. To seek the least-length solution, we wish to solve

  minimize_{w ∈ R^n} (1/2) ||w||_2^2  subject to  A w = b.

The Lagrangian dual problem is

  minimize_{x ∈ R^m} f(x) := (1/2) ||A^T x||_2^2 - b^T x,

where we note that ∇f(x) = A A^T x - b and ∇_i f(x) = a_i^T A^T x - b_i. The solutions to the primal and dual are related via w* = A^T x*. Coordinate descent gives

  x_{k+1} = x_k - α (a_i^T A^T x_k - b_i) e_i.

If we maintain an estimate w_k = A^T x_k, then we see that

  w_{k+1} = A^T x_{k+1} = A^T ( x_k - α (a_i^T A^T x_k - b_i) e_i )
          = A^T x_k - α (a_i^T A^T x_k - b_i) a_i
          = w_k - α (a_i^T w_k - b_i) a_i.
Note that if α = 1, then it follows by using ||a_i||_2 = 1 that

  a_i^T w_{k+1} = a_i^T w_k - (a_i^T w_k - b_i) a_i^T a_i = a_i^T w_k - (a_i^T w_k - b_i) = b_i,

so that the i-th equation is satisfied exactly.

Linear Equations Summary
- Coordinate minimization for solving the dual problem associated with linear equations along the direction e_i with α = 1 satisfies the i-th linear equation exactly. It is sometimes called the method of successive projections.
- The update w_{k+1} = w_k - α (a_i^T w_k - b_i) a_i costs:
  - n + 1 additions/subtractions
  - 2n + 1 multiplications
  - 3n + 2 total floating-point operations
- Computing ∇f(x) requires a multiplication with A, which is much more expensive.
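The successive-projection update above can be sketched directly; this is a minimal Python version (the 2×3 system is illustrative data, and the row normalization is done inside the sketch so that ||a_i||_2 = 1 as the derivation assumes):

```python
import math

def successive_projection(A, b, sweeps=100):
    """Dual coordinate minimization with alpha = 1, maintained in the primal
    variable w = A^T x.  Cycling over rows i, the update
    w <- w - (a_i^T w - b_i) a_i projects w onto {v : a_i^T v = b_i},
    so row i holds exactly after the step.  Starting from w = 0, the
    iterates stay in the row space of A, so the limit is the least-length
    solution of A w = b."""
    A = [row[:] for row in A]                  # work on copies
    b = list(b)
    m, n = len(A), len(A[0])
    for i in range(m):                         # normalize rows; rescale b to match
        nrm = math.sqrt(sum(v * v for v in A[i]))
        A[i] = [v / nrm for v in A[i]]
        b[i] /= nrm
    w = [0.0] * n
    for _ in range(sweeps):
        for i in range(m):                     # cyclic order over the m equations
            r = sum(A[i][j] * w[j] for j in range(n)) - b[i]
            for j in range(n):
                w[j] -= r * A[i][j]
    return w

A = [[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]]         # m = 2 < n = 3, full row rank
b = [3.0, 2.0]
w = successive_projection(A, b)
```

Each inner step touches a single row of A, matching the 3n + 2 flop count above; this cyclic variant is also known as the Kaczmarz method.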
Logistic Regression

Given data {d_j}_{j=1}^{N} ⊂ R^n and labels {y_j}_{j=1}^{N} ⊂ {-1, 1} associated with the data, solve

  minimize_x  f(x) := (1/N) Σ_{j=1}^{N} log(1 + e^{-y_j d_j^T x}).

If we define the data matrix D such that

  D = [ d_1^T ; ... ; d_N^T ] ∈ R^{N×n},

then it follows that

  ∇_i f(x) = -(1/N) Σ_{j=1}^{N} ( e^{-y_j d_j^T x} / (1 + e^{-y_j d_j^T x}) ) y_j d_{ji}.

Consider the coordinate minimization update

  x_{k+1} = x_k + α e_{i_k}  for some i_k ∈ {1, 2, ..., n} and α ∈ R.

For efficiency, we store and update the required quantities {D x_k} using

  D x_{k+1} (new value) = D (x_k + α e_{i_k}) = D x_k + α D e_{i_k} = D x_k (old value) + α D(:, i_k),

where D(:, i_k) denotes the i_k-th column of D; if x_0 ≡ 0, then we can set D x_0 ≡ 0.

Logistic Regression Summary
- Coordinate minimization for the logistic regression problem does not require computing the entire gradient during every iteration.
- The update to obtain D x_{k+1} requires a single vector-vector add.
- Computing ∇_i f(x_k) only requires accessing a single column of the data matrix D.
- Computing ∇f(x) requires accessing the entire data matrix D.

References
[1] A. Beck and L. Tetruashvili, On the convergence of block coordinate descent type methods, SIAM Journal on Optimization, 23 (2013), pp. 2037-2060.
[2] D. Leventhal and A. S. Lewis, Randomized methods for linear constraints: convergence rates and conditioning, Mathematics of Operations Research, 35 (2010), pp. 641-654.
[3] Y. Nesterov, Efficiency of coordinate descent methods on huge-scale optimization problems, SIAM Journal on Optimization, 22 (2012), pp. 341-362.
[4] J. Nutini, M. Schmidt, I. H. Laradji, M. Friedlander, and H. Koepke, Coordinate descent converges faster with the Gauss-Southwell rule than random selection, in Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), 2015.
[5] M. J. D. Powell, On search directions for minimization algorithms, Mathematical Programming, 4 (1973).
[6] A. P. Ruszczyński, Nonlinear Optimization, Princeton University Press, 2006.
[7] S. J. Wright, Coordinate descent algorithms, Mathematical Programming, 151 (2015), pp. 3-34.
More informationarxiv: v1 [math.pr] 6 Apr 2015
Analysis of the Optimal Resource Allocation for a Tandem Queueing System arxiv:1504.01248v1 [math.pr] 6 Apr 2015 Liu Zaiming, Chen Gang, Wu Jinbiao School of Mathematics and Statistics, Central South University,
More informationDecomposition Methods
Decomposition Methods separable problems, complicating variables primal decomposition dual decomposition complicating constraints general decomposition structures Prof. S. Boyd, EE364b, Stanford University
More informationMAT 4250: Lecture 1 Eric Chung
1 MAT 4250: Lecture 1 Eric Chung 2Chapter 1: Impartial Combinatorial Games 3 Combinatorial games Combinatorial games are two-person games with perfect information and no chance moves, and with a win-or-lose
More information25 Increasing and Decreasing Functions
- 25 Increasing and Decreasing Functions It is useful in mathematics to define whether a function is increasing or decreasing. In this section we will use the differential of a function to determine this
More informationChapter 7 One-Dimensional Search Methods
Chapter 7 One-Dimensional Search Methods An Introduction to Optimization Spring, 2014 1 Wei-Ta Chu Golden Section Search! Determine the minimizer of a function over a closed interval, say. The only assumption
More informationCharacterization of the Optimum
ECO 317 Economics of Uncertainty Fall Term 2009 Notes for lectures 5. Portfolio Allocation with One Riskless, One Risky Asset Characterization of the Optimum Consider a risk-averse, expected-utility-maximizing
More informationDASC: A DECOMPOSITION ALGORITHM FOR MULTISTAGE STOCHASTIC PROGRAMS WITH STRONGLY CONVEX COST FUNCTIONS
DASC: A DECOMPOSITION ALGORITHM FOR MULTISTAGE STOCHASTIC PROGRAMS WITH STRONGLY CONVEX COST FUNCTIONS Vincent Guigues School of Applied Mathematics, FGV Praia de Botafogo, Rio de Janeiro, Brazil vguigues@fgv.br
More informationA No-Arbitrage Theorem for Uncertain Stock Model
Fuzzy Optim Decis Making manuscript No (will be inserted by the editor) A No-Arbitrage Theorem for Uncertain Stock Model Kai Yao Received: date / Accepted: date Abstract Stock model is used to describe
More informationSublinear Time Algorithms Oct 19, Lecture 1
0368.416701 Sublinear Time Algorithms Oct 19, 2009 Lecturer: Ronitt Rubinfeld Lecture 1 Scribe: Daniel Shahaf 1 Sublinear-time algorithms: motivation Twenty years ago, there was practically no investigation
More informationPortfolio Management and Optimal Execution via Convex Optimization
Portfolio Management and Optimal Execution via Convex Optimization Enzo Busseti Stanford University April 9th, 2018 Problems portfolio management choose trades with optimization minimize risk, maximize
More informationRevenue Management Under the Markov Chain Choice Model
Revenue Management Under the Markov Chain Choice Model Jacob B. Feldman School of Operations Research and Information Engineering, Cornell University, Ithaca, New York 14853, USA jbf232@cornell.edu Huseyin
More informationCSCI 1951-G Optimization Methods in Finance Part 00: Course Logistics Introduction to Finance Optimization Problems
CSCI 1951-G Optimization Methods in Finance Part 00: Course Logistics Introduction to Finance Optimization Problems January 26, 2018 1 / 24 Basic information All information is available in the syllabus
More informationHIGH ORDER DISCONTINUOUS GALERKIN METHODS FOR 1D PARABOLIC EQUATIONS. Ahmet İzmirlioğlu. BS, University of Pittsburgh, 2004
HIGH ORDER DISCONTINUOUS GALERKIN METHODS FOR D PARABOLIC EQUATIONS by Ahmet İzmirlioğlu BS, University of Pittsburgh, 24 Submitted to the Graduate Faculty of Art and Sciences in partial fulfillment of
More informationTechnical Report Doc ID: TR April-2009 (Last revised: 02-June-2009)
Technical Report Doc ID: TR-1-2009. 14-April-2009 (Last revised: 02-June-2009) The homogeneous selfdual model algorithm for linear optimization. Author: Erling D. Andersen In this white paper we present
More information4: SINGLE-PERIOD MARKET MODELS
4: SINGLE-PERIOD MARKET MODELS Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2016 M. Rutkowski (USydney) Slides 4: Single-Period Market Models 1 / 87 General Single-Period
More informationPart 1: q Theory and Irreversible Investment
Part 1: q Theory and Irreversible Investment Goal: Endogenize firm characteristics and risk. Value/growth Size Leverage New issues,... This lecture: q theory of investment Irreversible investment and real
More informationEvaluation complexity of adaptive cubic regularization methods for convex unconstrained optimization
Evaluation complexity of adaptive cubic regularization methods for convex unconstrained optimization Coralia Cartis, Nicholas I. M. Gould and Philippe L. Toint October 30, 200; Revised March 30, 20 Abstract
More informationGame Theory: Normal Form Games
Game Theory: Normal Form Games Michael Levet June 23, 2016 1 Introduction Game Theory is a mathematical field that studies how rational agents make decisions in both competitive and cooperative situations.
More informationLecture 8: Asset pricing
BURNABY SIMON FRASER UNIVERSITY BRITISH COLUMBIA Paul Klein Office: WMC 3635 Phone: (778) 782-9391 Email: paul klein 2@sfu.ca URL: http://paulklein.ca/newsite/teaching/483.php Economics 483 Advanced Topics
More informationTutorial 4 - Pigouvian Taxes and Pollution Permits II. Corrections
Johannes Emmerling Natural resources and environmental economics, TSE Tutorial 4 - Pigouvian Taxes and Pollution Permits II Corrections Q 1: Write the environmental agency problem as a constrained minimization
More informationOptimal Allocation of Policy Limits and Deductibles
Optimal Allocation of Policy Limits and Deductibles Ka Chun Cheung Email: kccheung@math.ucalgary.ca Tel: +1-403-2108697 Fax: +1-403-2825150 Department of Mathematics and Statistics, University of Calgary,
More informationSteepest descent and conjugate gradient methods with variable preconditioning
Ilya Lashuk and Andrew Knyazev 1 Steepest descent and conjugate gradient methods with variable preconditioning Ilya Lashuk (the speaker) and Andrew Knyazev Department of Mathematics and Center for Computational
More informationRichardson Extrapolation Techniques for the Pricing of American-style Options
Richardson Extrapolation Techniques for the Pricing of American-style Options June 1, 2005 Abstract Richardson Extrapolation Techniques for the Pricing of American-style Options In this paper we re-examine
More informationLecture 17: More on Markov Decision Processes. Reinforcement learning
Lecture 17: More on Markov Decision Processes. Reinforcement learning Learning a model: maximum likelihood Learning a value function directly Monte Carlo Temporal-difference (TD) learning COMP-424, Lecture
More informationIntro to Economic analysis
Intro to Economic analysis Alberto Bisin - NYU 1 The Consumer Problem Consider an agent choosing her consumption of goods 1 and 2 for a given budget. This is the workhorse of microeconomic theory. (Notice
More informationNo-arbitrage theorem for multi-factor uncertain stock model with floating interest rate
Fuzzy Optim Decis Making 217 16:221 234 DOI 117/s17-16-9246-8 No-arbitrage theorem for multi-factor uncertain stock model with floating interest rate Xiaoyu Ji 1 Hua Ke 2 Published online: 17 May 216 Springer
More informationChapter 5 Finite Difference Methods. Math6911 W07, HM Zhu
Chapter 5 Finite Difference Methods Math69 W07, HM Zhu References. Chapters 5 and 9, Brandimarte. Section 7.8, Hull 3. Chapter 7, Numerical analysis, Burden and Faires Outline Finite difference (FD) approximation
More informationMarkowitz portfolio theory
Markowitz portfolio theory Farhad Amu, Marcus Millegård February 9, 2009 1 Introduction Optimizing a portfolio is a major area in nance. The objective is to maximize the yield and simultaneously minimize
More informationChapter 5 Portfolio. O. Afonso, P. B. Vasconcelos. Computational Economics: a concise introduction
Chapter 5 Portfolio O. Afonso, P. B. Vasconcelos Computational Economics: a concise introduction O. Afonso, P. B. Vasconcelos Computational Economics 1 / 22 Overview 1 Introduction 2 Economic model 3 Numerical
More informationProbability. An intro for calculus students P= Figure 1: A normal integral
Probability An intro for calculus students.8.6.4.2 P=.87 2 3 4 Figure : A normal integral Suppose we flip a coin 2 times; what is the probability that we get more than 2 heads? Suppose we roll a six-sided
More informationEssays on Some Combinatorial Optimization Problems with Interval Data
Essays on Some Combinatorial Optimization Problems with Interval Data a thesis submitted to the department of industrial engineering and the institute of engineering and sciences of bilkent university
More informationOrder book resilience, price manipulations, and the positive portfolio problem
Order book resilience, price manipulations, and the positive portfolio problem Alexander Schied Mannheim University PRisMa Workshop Vienna, September 28, 2009 Joint work with Aurélien Alfonsi and Alla
More informationOn Complexity of Multistage Stochastic Programs
On Complexity of Multistage Stochastic Programs Alexander Shapiro School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205, USA e-mail: ashapiro@isye.gatech.edu
More informationGPD-POT and GEV block maxima
Chapter 3 GPD-POT and GEV block maxima This chapter is devoted to the relation between POT models and Block Maxima (BM). We only consider the classical frameworks where POT excesses are assumed to be GPD,
More informationPORTFOLIO OPTIMIZATION AND EXPECTED SHORTFALL MINIMIZATION FROM HISTORICAL DATA
PORTFOLIO OPTIMIZATION AND EXPECTED SHORTFALL MINIMIZATION FROM HISTORICAL DATA We begin by describing the problem at hand which motivates our results. Suppose that we have n financial instruments at hand,
More informationLecture 19: March 20
CS71 Randomness & Computation Spring 018 Instructor: Alistair Sinclair Lecture 19: March 0 Disclaimer: These notes have not been subjected to the usual scrutiny accorded to formal publications. They may
More informationThe Probabilistic Method - Probabilistic Techniques. Lecture 7: Martingales
The Probabilistic Method - Probabilistic Techniques Lecture 7: Martingales Sotiris Nikoletseas Associate Professor Computer Engineering and Informatics Department 2015-2016 Sotiris Nikoletseas, Associate
More informationApplication of an Interval Backward Finite Difference Method for Solving the One-Dimensional Heat Conduction Problem
Application of an Interval Backward Finite Difference Method for Solving the One-Dimensional Heat Conduction Problem Malgorzata A. Jankowska 1, Andrzej Marciniak 2 and Tomasz Hoffmann 2 1 Poznan University
More informationYao s Minimax Principle
Complexity of algorithms The complexity of an algorithm is usually measured with respect to the size of the input, where size may for example refer to the length of a binary word describing the input,
More informationInfinite Reload Options: Pricing and Analysis
Infinite Reload Options: Pricing and Analysis A. C. Bélanger P. A. Forsyth April 27, 2006 Abstract Infinite reload options allow the user to exercise his reload right as often as he chooses during the
More informationI R TECHNICAL RESEARCH REPORT. A Framework for Mixed Estimation of Hidden Markov Models. by S. Dey, S. Marcus T.R
TECHNICAL RESEARCH REPORT A Framework for Mixed Estimation of Hidden Markov Models by S. Dey, S. Marcus T.R. 98-31 I R INSTITUTE FOR SYSTEMS RESEARCH ISR develops, applies and teaches advanced methodologies
More informationLarge-Scale SVM Optimization: Taking a Machine Learning Perspective
Large-Scale SVM Optimization: Taking a Machine Learning Perspective Shai Shalev-Shwartz Toyota Technological Institute at Chicago Joint work with Nati Srebro Talk at NEC Labs, Princeton, August, 2008 Shai
More informationMengdi Wang. July 3rd, Laboratory for Information and Decision Systems, M.I.T.
Practice July 3rd, 2012 Laboratory for Information and Decision Systems, M.I.T. 1 2 Infinite-Horizon DP Minimize over policies the objective cost function J π (x 0 ) = lim N E w k,k=0,1,... DP π = {µ 0,µ
More informationLecture 8: Introduction to asset pricing
THE UNIVERSITY OF SOUTHAMPTON Paul Klein Office: Murray Building, 3005 Email: p.klein@soton.ac.uk URL: http://paulklein.se Economics 3010 Topics in Macroeconomics 3 Autumn 2010 Lecture 8: Introduction
More informationPricing Problems under the Markov Chain Choice Model
Pricing Problems under the Markov Chain Choice Model James Dong School of Operations Research and Information Engineering, Cornell University, Ithaca, New York 14853, USA jd748@cornell.edu A. Serdar Simsek
More informationDefinition 4.1. In a stochastic process T is called a stopping time if you can tell when it happens.
102 OPTIMAL STOPPING TIME 4. Optimal Stopping Time 4.1. Definitions. On the first day I explained the basic problem using one example in the book. On the second day I explained how the solution to the
More information4 Reinforcement Learning Basic Algorithms
Learning in Complex Systems Spring 2011 Lecture Notes Nahum Shimkin 4 Reinforcement Learning Basic Algorithms 4.1 Introduction RL methods essentially deal with the solution of (optimal) control problems
More informationNotes on the EM Algorithm Michael Collins, September 24th 2005
Notes on the EM Algorithm Michael Collins, September 24th 2005 1 Hidden Markov Models A hidden Markov model (N, Σ, Θ) consists of the following elements: N is a positive integer specifying the number of
More informationOptimal Stopping. Nick Hay (presentation follows Thomas Ferguson s Optimal Stopping and Applications) November 6, 2008
(presentation follows Thomas Ferguson s and Applications) November 6, 2008 1 / 35 Contents: Introduction Problems Markov Models Monotone Stopping Problems Summary 2 / 35 The Secretary problem You have
More informationWeek 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals
Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg :
More informationForecast Horizons for Production Planning with Stochastic Demand
Forecast Horizons for Production Planning with Stochastic Demand Alfredo Garcia and Robert L. Smith Department of Industrial and Operations Engineering Universityof Michigan, Ann Arbor MI 48109 December
More informationA Stochastic Approximation Algorithm for Making Pricing Decisions in Network Revenue Management Problems
A Stochastic Approximation Algorithm for Making ricing Decisions in Network Revenue Management roblems Sumit Kunnumkal Indian School of Business, Gachibowli, Hyderabad, 500032, India sumit kunnumkal@isb.edu
More informationLecture Notes 6. Assume F belongs to a family of distributions, (e.g. F is Normal), indexed by some parameter θ.
Sufficient Statistics Lecture Notes 6 Sufficiency Data reduction in terms of a particular statistic can be thought of as a partition of the sample space X. Definition T is sufficient for θ if the conditional
More informationOn the complexity of the steepest-descent with exact linesearches
On the complexity of the steepest-descent with exact linesearches Coralia Cartis, Nicholas I. M. Gould and Philippe L. Toint 9 September 22 Abstract The worst-case complexity of the steepest-descent algorithm
More informationValuation of performance-dependent options in a Black- Scholes framework
Valuation of performance-dependent options in a Black- Scholes framework Thomas Gerstner, Markus Holtz Institut für Numerische Simulation, Universität Bonn, Germany Ralf Korn Fachbereich Mathematik, TU
More informationAllocation of Risk Capital via Intra-Firm Trading
Allocation of Risk Capital via Intra-Firm Trading Sean Hilden Department of Mathematical Sciences Carnegie Mellon University December 5, 2005 References 1. Artzner, Delbaen, Eber, Heath: Coherent Measures
More informationThe Optimization Process: An example of portfolio optimization
ISyE 6669: Deterministic Optimization The Optimization Process: An example of portfolio optimization Shabbir Ahmed Fall 2002 1 Introduction Optimization can be roughly defined as a quantitative approach
More informationEC316a: Advanced Scientific Computation, Fall Discrete time, continuous state dynamic models: solution methods
EC316a: Advanced Scientific Computation, Fall 2003 Notes Section 4 Discrete time, continuous state dynamic models: solution methods We consider now solution methods for discrete time models in which decisions
More informationHomework Assignments
Homework Assignments Week 1 (p. 57) #4.1, 4., 4.3 Week (pp 58 6) #4.5, 4.6, 4.8(a), 4.13, 4.0, 4.6(b), 4.8, 4.31, 4.34 Week 3 (pp 15 19) #1.9, 1.1, 1.13, 1.15, 1.18 (pp 9 31) #.,.6,.9 Week 4 (pp 36 37)
More information16 MAKING SIMPLE DECISIONS
247 16 MAKING SIMPLE DECISIONS Let us associate each state S with a numeric utility U(S), which expresses the desirability of the state A nondeterministic action A will have possible outcome states Result
More informationLecture l(x) 1. (1) x X
Lecture 14 Agenda for the lecture Kraft s inequality Shannon codes The relation H(X) L u (X) = L p (X) H(X) + 1 14.1 Kraft s inequality While the definition of prefix-free codes is intuitively clear, we
More informationEcon 582 Nonlinear Regression
Econ 582 Nonlinear Regression Eric Zivot June 3, 2013 Nonlinear Regression In linear regression models = x 0 β (1 )( 1) + [ x ]=0 [ x = x] =x 0 β = [ x = x] [ x = x] x = β it is assumed that the regression
More informationMTH6154 Financial Mathematics I Interest Rates and Present Value Analysis
16 MTH6154 Financial Mathematics I Interest Rates and Present Value Analysis Contents 2 Interest Rates 16 2.1 Definitions.................................... 16 2.1.1 Rate of Return..............................
More information