Optimization motivation and background
Eddie Wadbro
Introduction to PDE Constrained Optimization, February 15-16, 2016

Acknowledgement
This mini-course is supported by an initiation grant from STINT, the Swedish Foundation for International Cooperation in Research and Higher Education. A handout version of the slides presented during the lectures can be downloaded from http://www8.cs.umu.se/~eddiew/optpde2016/

What can we do with numerical optimization? Operations research
- Allocation of resources for industrial production
- Finding logistics, scheduling, or transportation solutions
- Crew scheduling for airline cabin personnel
- Managing investment portfolios
What can we do with numerical optimization? Inverse problems
- Estimate material parameters from measurements
- Oil exploration; exploit data to obtain subsurface images
- Medical tomography
- Non-destructive testing
- Estimate initial conditions for numerical weather models from weather data

What can we do with numerical optimization? Engineering design
- Which shape is the best?
- Which material composition is the right one? ...
- Example: optimization of a cantilever beam. Use 50 % of the material while minimizing the compliance of the beam.
  (Figure: design domain Ω inside a larger domain D, loaded by a force f.)

Course objectives for today's lecture
- Become familiar with the language of optimization
- See examples of practical problems formulated as optimization problems
- Introduce important classes of optimization problems
  - Terminology
  - Characterization of solutions
Course objectives
- Introduce important numerical methods to solve optimization problems
  - Newton-type methods for unconstrained optimization
  - Feasible-point and barrier/penalty methods for nonlinear programming
- Obtain hands-on experience with software

A general optimization problem
Minimize f(x) among all x such that
  g_i(x) = 0,         i = 1, 2, ..., n_eq,
  h_i(x) ≤ 0,         i = 1, 2, ..., n_ineq,
  l_i ≤ x_i ≤ u_i     for some i's,
where f, g_i, h_i are functions from R^n to R.

Terminology
  f                   objective function
  x                   decision variables
  g_i(x) = 0          equality constraints
  h_i(x) ≤ 0          inequality constraints
  l_i ≤ x_i ≤ u_i     box constraints (special case of inequality constraints)
An x is feasible if the constraints are satisfied at x; otherwise x is infeasible. The feasible set S consists of all x that satisfy all constraints. Using S, the general problem can be written as min_{x ∈ S} f(x).
Assume that x is feasible for an inequality constraint h_i(x) ≤ 0. The constraint is active (or binding) if h_i(x) = 0; otherwise the constraint is inactive (nonbinding, or slack). A problem is unconstrained if there are no constraints.
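As a side note (not part of the original slides), a problem in this general form can be handed to a generic solver such as SciPy's minimize; the particular objective, constraints, and bounds below are illustrative assumptions, not taken from the lecture.

```python
from scipy.optimize import minimize

# Hypothetical instance of the general form:
# minimize f(x) subject to g(x) = 0, h(x) <= 0, and box bounds l <= x <= u.
f = lambda x: (x[0] - 1.0)**2 + (x[1] - 2.0)**2   # objective
g = lambda x: x[0] + x[1] - 2.0                   # equality constraint g(x) = 0
h = lambda x: x[0] - x[1]                         # inequality constraint h(x) <= 0

res = minimize(
    f, x0=[0.0, 0.0],
    constraints=[{"type": "eq", "fun": g},
                 {"type": "ineq", "fun": lambda x: -h(x)}],  # SciPy expects fun(x) >= 0
    bounds=[(0.0, 2.0), (0.0, 2.0)],              # box constraints
)
print(res.x)  # minimizer of f over the feasible set
```

Here the feasible set is the segment of the line x_1 + x_2 = 2 with x_1 ≤ x_2 inside the box, and the solver returns the point of that set closest to (1, 2), namely (0.5, 1.5).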
Global and local minimizers
Let S ⊆ R^n and f : S → R.
- Global minimizer: x* ∈ S such that f(x*) ≤ f(x) for all x ∈ S.
- Local minimizer: x* ∈ S such that, for some ε > 0, f(x*) ≤ f(x) for all x ∈ S ∩ B(x*; ε), where B(x*; ε) is the ball of radius ε centered at x*, that is, B(x*; ε) = {x ∈ R^n : ‖x − x*‖ ≤ ε}.

Convexity and optimality
A convex function f on a convex set S lies below or on the linear interpolant between any two points of the set:
  f(α x_1 + (1 − α) x_2) ≤ α f(x_1) + (1 − α) f(x_2)   for all x_1, x_2 ∈ S and all α ∈ [0, 1].
Examples of convex functions on R^n:
(a) f(x) = c^T x (also concave!)
(b) f(x) = γ + c^T x + ½ x^T Q x (Q a positive semidefinite matrix)

Theorem. For convex functions on convex sets, each local minimizer is a global minimizer.
Example: min f(x) such that Ax ≤ b, where f(x) = c^T x (linear program) or f(x) = γ + c^T x + ½ x^T Q x with Q positive semidefinite (quadratic program).
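Not part of the slides, but the defining inequality is easy to spot-check numerically for the quadratic example (b); the particular γ, c, and Q below are arbitrary assumptions, with Q built as A Aᵀ so that it is positive semidefinite by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
Q = A @ A.T                          # positive semidefinite by construction
c = rng.standard_normal(n)
gamma = 0.7

f = lambda x: gamma + c @ x + 0.5 * x @ Q @ x

# Spot-check f(a*x1 + (1-a)*x2) <= a*f(x1) + (1-a)*f(x2) on random samples
for _ in range(1000):
    x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
    a = rng.uniform()
    assert f(a * x1 + (1 - a) * x2) <= a * f(x1) + (1 - a) * f(x2) + 1e-9
print("convexity inequality holds on all samples")
```

A random check of course proves nothing; the rigorous argument is that the Hessian of f is Q, which is positive semidefinite.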
Unconstrained optimization
We will start by considering unconstrained optimization:
  min_{x ∈ R^n} f(x)
or, equivalently: find x* ∈ R^n such that f(x*) ≤ f(x) for all x ∈ R^n.
The function f is nonlinear in x; unconstrained optimization is meaningless for linear f, since a linear f on R^n is either unbounded or constant.
The most common applications of unconstrained optimization are inverse problems and parameter estimation.

Unconstrained optimization: example application
At time instants t_1, t_2, ..., t_m a physical process generates a time sequence of m observations (measurements) b_1, b_2, ..., b_m. A model of the process says that b_k ≈ b(t_k), where
  b(t) = x_1 + x_2 e^{x_3 t} + x_4 e^{x_5 t}.
The model is not exact: measuring errors (noise), modeling errors. We want to find the coefficients x_1, ..., x_5 that best match the observations. Define the residual
  r_k = b_k − (x_1 + x_2 e^{x_3 t_k} + x_4 e^{x_5 t_k})
and solve the unconstrained optimization problem
  min_{x_1, ..., x_5} Σ_{k=1}^{m} r_k².

A general iterative algorithm for unconstrained optimization
Problem: min_{x ∈ R^n} f(x). Many optimization algorithms are of the type:
1. Specify an initial guess x_0.
2. For k = 0, 1, ...
   2.1 If x_k is optimal, stop.
   2.2 Determine a search direction p_k and a step length α_k.
   2.3 Set x_{k+1} = x_k + α_k p_k.
For most problems, the optimum cannot be reached within a finite number of steps. Important: the convergence rate, that is, the behavior of x_k − x* as k → ∞.
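The exponential-fitting example above can be sketched with a standard least-squares solver; everything below (the "true" coefficients, the sampling times, the noise level, and the initial guess) is an assumption made up for illustration, not data from the lecture.

```python
import numpy as np
from scipy.optimize import least_squares

# Synthetic data for the model b(t) = x1 + x2*exp(x3*t) + x4*exp(x5*t)
x_true = np.array([1.0, 2.0, -1.0, 0.5, -3.0])     # assumed "true" coefficients
t = np.linspace(0.0, 4.0, 30)                      # assumed measurement times
rng = np.random.default_rng(1)
b = x_true[0] + x_true[1]*np.exp(x_true[2]*t) + x_true[3]*np.exp(x_true[4]*t)
b += 1e-3 * rng.standard_normal(t.size)            # measurement noise

# Residual r_k = b_k - (x1 + x2*exp(x3*t_k) + x4*exp(x5*t_k))
def residual(x):
    return b - (x[0] + x[1]*np.exp(x[2]*t) + x[3]*np.exp(x[4]*t))

# least_squares minimizes (1/2) * sum_k r_k^2 over x
res = least_squares(residual, x0=[0.5, 1.0, -0.5, 1.0, -2.0])
print(res.x)
```

Note that fitting sums of exponentials is notoriously ill-conditioned, so the result depends strongly on the initial guess; this is one reason the lecture's later material on search directions and convergence rates matters in practice.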
Convergence rate
Definition. The sequence {x_k} converges to x* with rate p and rate constant C < ∞ if x_k → x* and
  lim_{k→∞} ‖x_{k+1} − x*‖ / ‖x_k − x*‖^p = C.
- Linear: p = 1 and 0 < C < 1. The error is essentially multiplied by C each iteration.
- Quadratic: p = 2. Roughly a doubling of the correct digits each iteration.
- Superlinear: p = 1 and C = 0. "Faster than linear"; includes quadratic convergence but also "intermediate" rates.

Positive definite matrices
An n-by-n real matrix A is positive semidefinite if v^T A v ≥ 0 for all v ∈ R^n. It is positive definite if v^T A v > 0 for all v ∈ R^n with v ≠ 0.
- A positive definite matrix is nonsingular.
- A is positive definite if and only if A^{-1} is positive definite.
- A symmetric matrix A is positive definite if and only if
  - all eigenvalues are strictly positive, or equivalently
  - A = L L^T with L lower triangular and L_ii > 0 (Cholesky factorization).

Quadratic functions
Let φ : R^n → R be defined by
  φ(s) = α + g^T s + ½ s^T H s,
where α ∈ R, g and s are n-vectors, and H is an n-by-n matrix.
Theorem. If H is symmetric and positive definite, then the solution to H s = −g is the unique minimizer of φ.
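The theorem on quadratic functions can be illustrated numerically: setting the gradient ∇φ(s) = g + H s to zero gives H s = −g, which the Cholesky factorization solves with one forward and one back substitution. The matrix below is a random SPD matrix made up for the example.

```python
import numpy as np

# Minimize phi(s) = alpha + g^T s + 0.5 * s^T H s for H symmetric positive definite
rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)      # symmetric positive definite by construction
g = rng.standard_normal(n)

L = np.linalg.cholesky(H)        # H = L L^T exists since H is SPD
y = np.linalg.solve(L, -g)       # forward solve:  L y = -g
s = np.linalg.solve(L.T, y)      # back solve:     L^T s = y

# At the minimizer the gradient g + H s vanishes
print(np.linalg.norm(H @ s + g))
```

Solving via the Cholesky factor is both cheaper and numerically safer than forming H⁻¹ explicitly, which is why Newton-type methods compute their search directions this way.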
Finding minima
Finding a global minimum is very hard! Reason: the definition, in general, refers to all points of the feasible set. A local minimum is also very hard to find if nothing more about f is known. Finding a local minimum becomes easier if f is differentiable, since gradients provide information about the local behavior of the function.

Taylor series with remainder term
Let f : R^n → R be of class C² (twice continuously differentiable). Then for all x, y ∈ R^n,
  f(y) = f(x) + (y − x)^T ∇f(x) + ½ (y − x)^T ∇²f(ξ) (y − x),
where ξ = α x + (1 − α) y for some α ∈ [0, 1].
The gradient of f: ∇f(x) = (∂f/∂x_1, ..., ∂f/∂x_n)^T.
The Hessian of f: ∇²f(x), the n-by-n matrix with entries [∇²f(x)]_ij = ∂²f/(∂x_i ∂x_j).

Optimality conditions for unconstrained optimization
- First order necessary condition: Assume f : R^n → R has a local minimum at x = x* and that f is differentiable at x = x*. Then ∇f(x*) = 0.
- Second order necessary condition: Assume f : R^n → R has a local minimum at x = x* and that f is of class C². Then ∇²f(x*) is positive semidefinite.
- Second order sufficient condition: Assume f : R^n → R is of class C². If ∇f(x*) = 0 and ∇²f(x*) is positive definite, then x* is a local minimizer.
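As a supplementary illustration (not from the slides), the optimality conditions can be checked at a known minimizer. The Rosenbrock function f(x, y) = (1 − x)² + 100 (y − x²)² is a standard test function, chosen here as an assumption; its minimizer is (1, 1).

```python
import numpy as np

# Gradient of f(x, y) = (1 - x)^2 + 100*(y - x^2)^2
def grad(x, y):
    return np.array([-2.0*(1.0 - x) - 400.0*x*(y - x**2),
                     200.0*(y - x**2)])

# Hessian of the same function
def hess(x, y):
    return np.array([[2.0 - 400.0*(y - 3.0*x**2), -400.0*x],
                     [-400.0*x,                    200.0]])

x_star = (1.0, 1.0)
print(grad(*x_star))                       # first-order condition: gradient is zero
print(np.linalg.eigvalsh(hess(*x_star)))   # second-order condition: eigenvalues > 0
```

Both conditions of the second order sufficient condition hold at (1, 1), so it is a local minimizer; for this particular function it happens to be the global one as well.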