An adaptive cubic regularization algorithm for nonconvex optimization with convex constraints and its function-evaluation complexity


1 An adaptive cubic regularization algorithm for nonconvex optimization with convex constraints and its function-evaluation complexity

Coralia Cartis, Nick Gould and Philippe Toint, Department of Mathematics, University of Namur, Belgium (philippe.toint@fundp.ac.be). Buenos Aires, IFIP, July 2009.

2 Cubic regularization for unconstrained problems: The problem

We consider the unconstrained nonlinear programming problem: minimize $f(x)$ for $x \in \mathbb{R}^n$, where $f : \mathbb{R}^n \to \mathbb{R}$ is smooth. An important special case is the nonlinear least-squares problem: minimize $f(x) = \tfrac{1}{2}\|F(x)\|_2^2$ for $x \in \mathbb{R}^n$, where $F : \mathbb{R}^n \to \mathbb{R}^m$ is smooth.

3 Cubic regularization for unconstrained problems: A useful observation

Note the following: if $f$ has gradient $g$ and globally Lipschitz continuous Hessian $H$ with constant $2L$, then Taylor, Cauchy-Schwarz and Lipschitz imply
$$f(x+s) = f(x) + \langle s, g(x)\rangle + \tfrac{1}{2}\langle s, H(x)s\rangle + \int_0^1 (1-\alpha)\,\langle s, [H(x+\alpha s) - H(x)]\,s\rangle\, d\alpha$$
$$\le \underbrace{f(x) + \langle s, g(x)\rangle + \tfrac{1}{2}\langle s, H(x)s\rangle + \tfrac{1}{3}L\|s\|_2^3}_{m(s)},$$
so reducing $m$ from $s = 0$ improves $f$, since $m(0) = f(x)$.
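
The overestimation property is easy to verify numerically. A minimal check (not from the slides), using $f(x) = \sum_i \cos(x_i)$, whose third derivatives are bounded by 1 so that its Hessian is globally Lipschitz with constant $1 = 2L$:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x): return np.sum(np.cos(x))
def g(x): return -np.sin(x)
def H(x): return np.diag(-np.cos(x))

L = 0.5  # Hessian Lipschitz constant of f is 1 = 2L

x = rng.standard_normal(5)
for _ in range(1000):
    s = rng.standard_normal(5) * rng.uniform(0.1, 10.0)
    m = f(x) + s @ g(x) + 0.5 * s @ H(x) @ s + (L / 3) * np.linalg.norm(s) ** 3
    assert f(x + s) <= m + 1e-12   # m(s) overestimates f(x + s)
print("cubic overestimation held on all samples")
```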

4 Cubic regularization for unconstrained problems: The cubic regularization

Change from trust regions,
$$\min_s\; f(x) + \langle s, g(x)\rangle + \tfrac{1}{2}\langle s, H(x)s\rangle \quad \text{s.t. } \|s\|_2 \le \Delta,$$
to cubic regularization:
$$\min_s\; f(x) + \langle s, g(x)\rangle + \tfrac{1}{2}\langle s, H(x)s\rangle + \tfrac{1}{3}\sigma\|s\|^3.$$
$\sigma$ is the (adaptive) regularization parameter (ideas from Griewank, Weiser/Deuflhard/Erdmann, Nesterov/Polyak, Cartis/Gould/Toint).
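
Unlike the trust-region step, nothing explicitly constrains $s$ here: the cubic term alone keeps the minimizer finite. In one dimension the minimizer on the descent side even has a closed form, which makes the effect of $\sigma$ easy to see (a sketch, not from the slides):

```python
import numpy as np

def cubic_step_1d(g, H, sigma):
    """Minimizer of g*s + 0.5*H*s**2 + (sigma/3)*|s|**3 on the descent
    side s = -sign(g)*t, t >= 0: the derivative sigma*t**2 + H*t - |g|
    has exactly one positive root. For H >= 0 this is the global
    minimizer; a nonconvex model may have another local minimizer on
    the other side, which this sketch ignores."""
    t = (-H + np.sqrt(H * H + 4.0 * sigma * abs(g))) / (2.0 * sigma)
    return -np.sign(g) * t

print(cubic_step_1d(1.0, -2.0, 0.5))  # larger step when the model is nonconvex
```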

5 Cubic regularization for unconstrained problems: Cubic regularization highlights

$$f(x+s) \le m(s) \equiv f(x) + s^T g(x) + \tfrac{1}{2}\, s^T H(x)\, s + \tfrac{1}{3} L \|s\|_2^3$$

Nesterov and Polyak minimize $m$ globally and exactly (N.B. $m$ may be nonconvex!); an efficient scheme exists to do so if $H$ has sparse factors. This yields global (ultimately rapid) convergence to a 2nd-order critical point of $f$, with better worst-case function-evaluation complexity than previously known.

Obvious questions: can we avoid the global Lipschitz requirement? Can we approximately minimize $m$ and retain good worst-case function-evaluation complexity? Does this work well in practice?

6 Cubic regularization for unconstrained problems: Cubic overestimation

Assume $f \in C^2$, write $f$, $g$ and $H$ at $x_k$ as $f_k$, $g_k$ and $H_k$, and let $B_k$ be a symmetric approximation to $H_k$, with $B_k$ and $H_k$ bounded at points of interest. Use the cubic overestimating model at $x_k$:
$$m_k(s) \equiv f_k + s^T g_k + \tfrac{1}{2}\, s^T B_k s + \tfrac{1}{3}\, \sigma_k \|s\|^3,$$
where $\sigma_k$ is the iteration-dependent regularisation weight. This is easily generalized for regularisation in the $M_k$-norm $\|s\|_{M_k} = \sqrt{s^T M_k s}$, where $M_k$ is uniformly positive definite.
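
For reference in the sketches below, evaluating the model is a one-liner; the function name and the optional $M_k$ argument are illustrative assumptions:

```python
import numpy as np

def cubic_model(s, f_k, g_k, B_k, sigma_k, M_k=None):
    """m_k(s) = f_k + s'g_k + 0.5 s'B_k s + (sigma_k/3) ||s||^3, with
    ||.|| the Euclidean norm or, if M_k is given, the M_k-norm."""
    ns = np.sqrt(s @ M_k @ s) if M_k is not None else np.linalg.norm(s)
    return f_k + s @ g_k + 0.5 * s @ B_k @ s + (sigma_k / 3.0) * ns ** 3
```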

7 Cubic regularization for unconstrained problems: Adaptive Regularization with Cubic (ARC)

Algorithm 1.1: The ARC Algorithm

Step 0: Initialization: $x_0$ and $\sigma_0 > 0$ given. Set $k = 0$.

Step 1: Step computation: compute $s_k$ for which $m_k(s_k) \le m_k(s_k^C)$, where the Cauchy point is $s_k^C = -\alpha_k^C g_k$ with $\alpha_k^C = \arg\min_{\alpha \in \mathbb{R}_+} m_k(-\alpha g_k)$.

Step 2: Step acceptance: compute
$$\rho_k = \frac{f(x_k) - f(x_k + s_k)}{f(x_k) - m_k(s_k)}$$
and set $x_{k+1} = x_k + s_k$ if $\rho_k > 0.1$, and $x_{k+1} = x_k$ otherwise.

Step 3: Update the regularization parameter:
$$\sigma_{k+1} \in \begin{cases} (0, \sigma_k], \text{ e.g. } \tfrac{1}{2}\sigma_k, & \text{if } \rho_k > 0.9 \text{ (very successful)},\\ [\sigma_k, \gamma_1 \sigma_k], \text{ e.g. } \sigma_k, & \text{if } 0.1 \le \rho_k \le 0.9 \text{ (successful)},\\ [\gamma_1 \sigma_k, \gamma_2 \sigma_k], \text{ e.g. } 2\sigma_k, & \text{otherwise (unsuccessful)}. \end{cases}$$
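
A minimal Python sketch of Algorithm 1.1, in which the step is just the Cauchy point itself, the weakest choice Step 1 allows (practical implementations minimize $m_k$ further, e.g. over Krylov subspaces); the update factors 1/2 and 2 and the thresholds 0.1/0.9 follow the slide, the rest is illustrative:

```python
import numpy as np

def arc(f, grad, B, x, sigma=1.0, tol=1e-6, max_iter=500):
    for _ in range(max_iter):
        g = grad(x)
        ng = np.linalg.norm(g)
        if ng <= tol:
            break
        # alpha_C minimizes the scalar cubic m_k(-alpha g): its derivative
        # sigma*ng**3 * a**2 + (g'Bg) * a - ng**2 has one positive root.
        b = g @ B(x) @ g
        alpha = (-b + np.sqrt(b * b + 4.0 * sigma * ng ** 5)) / (2.0 * sigma * ng ** 3)
        s = -alpha * g
        decrease = alpha * ng ** 2 - 0.5 * alpha ** 2 * b - (sigma / 3.0) * (alpha * ng) ** 3
        rho = (f(x) - f(x + s)) / decrease       # decrease > 0 at the Cauchy point
        if rho > 0.1:                            # Step 2: accept the trial point
            x = x + s
        # Step 3: adapt the regularization weight
        sigma = 0.5 * sigma if rho > 0.9 else (sigma if rho >= 0.1 else 2.0 * sigma)
    return x
```

For instance, arc(lambda x: np.sum(np.cos(x)), lambda x: -np.sin(x), lambda x: np.diag(-np.cos(x)), np.ones(5)) drives the gradient below the tolerance on the toy function used earlier.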

8 Cubic regularization for unconstrained problems: Local convergence theory for cubic regularization (1)

The Cauchy condition:
$$m_k(x_k) - m_k(x_k + s_k) \ge \kappa_{CR}\, \|g_k\| \min\left[\frac{\|g_k\|}{1 + \|H_k\|},\ \sqrt{\frac{\|g_k\|}{\sigma_k}}\,\right].$$

The bound on the stepsize:
$$\|s_k\| \le 3 \max\left[\frac{\|H_k\|}{\sigma_k},\ \sqrt{\frac{\|g_k\|}{\sigma_k}}\,\right].$$

(Cartis/Gould/Toint)
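
The stepsize bound holds for any step that does not increase the model, so it can be sanity-checked on random data with the Cauchy step from the sketch above (using the model Hessian $B$ in place of $H_k$; an empirical check, not a proof):

```python
import numpy as np

rng = np.random.default_rng(1)
for _ in range(1000):
    A = rng.standard_normal((8, 8)); B = 0.5 * (A + A.T)   # random symmetric B
    g = rng.standard_normal(8); sigma = rng.uniform(0.01, 10.0)
    ng = np.linalg.norm(g)
    b = g @ B @ g
    alpha = (-b + np.sqrt(b * b + 4 * sigma * ng ** 5)) / (2 * sigma * ng ** 3)
    bound = 3 * max(np.linalg.norm(B, 2) / sigma, np.sqrt(ng / sigma))
    assert alpha * ng <= bound + 1e-8                      # ||s|| = alpha * ||g||
print("stepsize bound held on all samples")
```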

9 Cubic regularization for unconstrained problems: Local convergence theory for cubic regularization (2)

And therefore $\lim_{k \to \infty} g_k = 0$: first-order global convergence. Under stronger assumptions one can show that, if $s_k$ minimizes $m_k$ over a subspace with orthogonal basis $Q_k$, then $\lim_{k \to \infty} \lambda_{\min}(Q_k^T H_k Q_k) \ge 0$: second-order global convergence.

10 Cubic regularization for unconstrained problems: Fast convergence

For fast asymptotic convergence one needs to improve on the Cauchy point: minimize $m_k$ over (nested) Krylov subspaces, terminating with one of the rules below (both are sketched after this slide):

g stopping rule: $\|\nabla_s m_k(s_k)\| \le \min(1, \|g_k\|^{1/2})\, \|g_k\|$
s stopping rule: $\|\nabla_s m_k(s_k)\| \le \min(1, \|s_k\|)\, \|g_k\|$

If $B_k$ satisfies the Dennis-Moré condition $\|(B_k - H_k)s_k\| / \|s_k\| \to 0$ whenever $g_k \to 0$, and $x_k \to x_*$ with positive definite $H(x_*)$, then $x_k$ converges Q-superlinearly under the g- and s-rules. If additionally $H(x)$ is locally Lipschitz around $x_*$ and $\|(B_k - H_k)s_k\| = O(\|s_k\|^2)$, then $x_k$ converges Q-quadratically under the s-rule.
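
Both rules are cheap tests on the model gradient at the current inner iterate; a direct transcription (function name assumed):

```python
import numpy as np

def inner_converged(grad_m_s, g_k, s_k, rule="s"):
    """g-rule: ||grad m_k(s_k)|| <= min(1, ||g_k||**0.5) * ||g_k||
       s-rule: ||grad m_k(s_k)|| <= min(1, ||s_k||)      * ||g_k||"""
    ng = np.linalg.norm(g_k)
    factor = min(1.0, np.sqrt(ng)) if rule == "g" else min(1.0, np.linalg.norm(s_k))
    return np.linalg.norm(grad_m_s) <= factor * ng
```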

11 Cubic regularization for unconstrained problems: Function-evaluation complexity

How many function evaluations (iterations) are needed to ensure that $\|g_k\| \le \epsilon$? As long as, on very successful iterations, $\sigma_{k+1} \ge \gamma_3 \sigma_k$ for some $\gamma_3 < 1$, the basic ARC algorithm requires at most $\kappa_C\, \epsilon^{-2}$ function evaluations, for some $\kappa_C$ independent of $\epsilon$ (c.f. steepest descent). If $H$ is globally Lipschitz, the s-rule is applied, and additionally $s_k$ is the global (line) minimizer of $m_k(\alpha s_k)$ as a function of $\alpha$, then the ARC algorithm requires at most $\kappa_S\, \epsilon^{-3/2}$ function evaluations, for some $\kappa_S$ independent of $\epsilon$ (c.f. Nesterov & Polyak).

12 Cubic regularization for unconstrained problems: Minimizing the model

$$m(s) \equiv f + s^T g + \tfrac{1}{2}\, s^T B s + \tfrac{1}{3}\, \sigma \|s\|_2^3$$

Small problems: use a Moré-Sorensen-like method with a modified secular equation (also OK as long as factorization is feasible). Large problems: use an iterative Krylov-space method to find an approximate solution. Numerically sound procedures exist for computing exact and approximate steps.
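
A minimal sketch of the small-problem route, assuming an eigendecomposition of $B$ is affordable: the global minimizer satisfies $(B + \lambda I)s = -g$ with $\lambda = \sigma\|s\|$ and $B + \lambda I \succeq 0$, so one can bisect on the monotone scalar secular equation $\|(B + \lambda I)^{-1} g\| = \lambda/\sigma$. This is not the Moré-Sorensen-style method of the slide, just the simplest correct variant, and the "hard case" is ignored:

```python
import numpy as np

def cubic_subproblem(g, B, sigma, tol=1e-12):
    w, V = np.linalg.eigh(B)                 # B = V diag(w) V'
    c = V.T @ g
    lo = max(0.0, -w[0]) + 1e-14             # lambda must make B + lambda*I PSD
    phi = lambda lam: np.linalg.norm(c / (w + lam)) - lam / sigma  # decreasing in lam
    hi = lo + 1.0
    while phi(hi) > 0:                       # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):      # plain bisection
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) > 0 else (lo, mid)
    lam = 0.5 * (lo + hi)
    return -V @ (c / (w + lam))              # s = -(B + lambda*I)^{-1} g
```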

13 Cubic regularization for unconstrained problems: The main features of adaptive cubic regularization

And the result is: longer steps on ill-conditioned problems; a similar (very satisfactory) convergence analysis; the best known function-evaluation complexity for nonconvex problems; excellent performance and reliability.

14 Cubic regularization for unconstrained problems: Numerical experience (small problems using Matlab)

[Figure: performance profile of iteration counts on 131 CUTEr problems, plotting the fraction of problems for which each method is within a factor $\alpha$ of the best: ACO with the g stopping rule (3 failures), ACO with the s stopping rule (3 failures), and trust region (8 failures).]

15 Regularization techniques for constrained problems: The constrained case

Can we apply regularization to the constrained case? Consider the constrained nonlinear programming problem: minimize $f(x)$ subject to $x \in \mathcal{F}$, for $x \in \mathbb{R}^n$ and $f : \mathbb{R}^n \to \mathbb{R}$ smooth, where $\mathcal{F}$ is convex. Main ideas: exploit (cheap) projections onto convex sets (two examples are sketched below); define the step using the generalized Cauchy point idea; prove global convergence plus a function-evaluation complexity bound.
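
The slides leave $\mathcal{F}$ generic; for concreteness, here are two feasible sets whose projections are indeed cheap (assumed examples):

```python
import numpy as np

def proj_box(x, l, u):
    """Projection onto the box F = {x : l <= x <= u}."""
    return np.clip(x, l, u)

def proj_ball(x, center, radius):
    """Projection onto the Euclidean ball F = {x : ||x - center|| <= radius}."""
    d = x - center
    nd = np.linalg.norm(d)
    return x if nd <= radius else center + (radius / nd) * d
```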

16 Regularization techniques for constrained problems: Constrained step computation (1)

$$\min_s\; f(x) + \langle s, g(x)\rangle + \tfrac{1}{2}\langle s, H(x)s\rangle + \tfrac{1}{3}\sigma\|s\|^3 \quad \text{subject to } x + s \in \mathcal{F},$$
where $\sigma$ is the (adaptive) regularization parameter, as before. Criticality measure:
$$\chi(x) \stackrel{\text{def}}{=} \left|\, \min_{x+d \in \mathcal{F},\ \|d\| \le 1} \langle \nabla_x f(x), d \rangle \,\right|.$$
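
For a box $\mathcal{F}$ and the $\ell_\infty$ norm on $d$ (an assumption made here so the minimization decouples coordinatewise; the slides do not fix the norm), $\chi(x)$ has a closed form:

```python
import numpy as np

def chi(g, x, l, u):
    """chi(x) = | min { g'd : l <= x + d <= u, ||d||_inf <= 1 } |."""
    lo = np.maximum(-1.0, l - x)        # feasible interval for each d_i
    hi = np.minimum(1.0, u - x)
    d = np.where(g > 0, lo, hi)         # minimize each g_i * d_i separately
    return abs(g @ d)
```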

17 Regularization techniques for constrained problems: The generalized Cauchy point for ARC

Cauchy step: a Goldstein-like piecewise linear search on $m_k$ along the gradient path projected onto $\mathcal{F}$. Find $t_k^{GC} > 0$ such that
$$x_k^{GC} = P_{\mathcal{F}}[x_k - t_k^{GC} g_k] \stackrel{\text{def}}{=} x_k + s_k^{GC}$$
satisfies
$$m_k(x_k^{GC}) \le f(x_k) + \kappa_{ubs} \langle g_k, s_k^{GC} \rangle \quad \text{(below linear approximation)}$$
and either
$$m_k(x_k^{GC}) \ge f(x_k) + \kappa_{lbs} \langle g_k, s_k^{GC} \rangle \quad \text{(above linear approximation)}$$
or
$$\|P_{T(x_k^{GC})}[-g_k]\| \le \kappa_{epp}\, |\langle g_k, s_k^{GC} \rangle| \quad \text{(close to the path's end)}.$$
No trust-region condition!
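
A crude sketch of the search (the third, end-of-path test is omitted; $m$ is the full model $s \mapsto m_k(x_k + s)$, so $m(0) = f(x_k)$, and halving/doubling of $t$ stands in for the piecewise linear search of the slide):

```python
import numpy as np

def generalized_cauchy_point(x, g, m, proj, kappa_ubs=0.1, kappa_lbs=0.9,
                             t=1.0, max_trials=50):
    f0 = m(np.zeros_like(x))
    for _ in range(max_trials):
        s = proj(x - t * g) - x                   # step along the projected path
        if m(s) > f0 + kappa_ubs * (g @ s):       # not enough model decrease: shrink t
            t *= 0.5
        elif m(s) < f0 + kappa_lbs * (g @ s):     # well below the lower line: grow t
            t *= 2.0
        else:
            break                                 # Goldstein-like interval found
    return x + s
```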

18 Regularization techniques for constrained problems: Searching for the ARC-GCP

[Figure: contours of an example cubic model $m_k(s)$ in three variables, illustrating the piecewise linear search for the ARC generalized Cauchy point over a feasible region with bound 1.5.]

19 Regularization techniques for constrained problems: A constrained regularized algorithm

Algorithm 2.1: ARC for Convex Constraints (COCARC)

Step 0: Initialization. $x_0 \in \mathcal{F}$, $\sigma_0$ given. Compute $f(x_0)$, set $k = 0$.

Step 1: Generalized Cauchy point. If $x_k$ is not critical, find the generalized Cauchy point $x_k^{GC}$ by a piecewise linear search on the regularized cubic model.

Step 2: Step calculation. Compute $s_k$ and $x_k^+ \stackrel{\text{def}}{=} x_k + s_k \in \mathcal{F}$ such that $m_k(x_k^+) \le m_k(x_k^{GC})$.

Step 3: Acceptance of the trial point. Compute $f(x_k^+)$ and $\rho_k$. If $\rho_k \ge \eta_1$, then $x_{k+1} = x_k + s_k$; otherwise $x_{k+1} = x_k$.

Step 4: Regularisation parameter update. Set
$$\sigma_{k+1} \in \begin{cases} (0, \sigma_k] & \text{if } \rho_k \ge \eta_2,\\ [\sigma_k, \gamma_1 \sigma_k] & \text{if } \rho_k \in [\eta_1, \eta_2),\\ [\gamma_1 \sigma_k, \gamma_2 \sigma_k] & \text{if } \rho_k < \eta_1. \end{cases}$$
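
A minimal end-to-end sketch of COCARC in which the generalized Cauchy point itself serves as the trial point (Step 2 only requires $m_k(x_k^+) \le m_k(x_k^{GC})$), the GCP search is the crude backtracking from the previous sketch, and a projected-gradient step stands in for the criticality test; all names and concrete constants are assumptions:

```python
import numpy as np

def cocarc(f, grad, B, proj, x, sigma=1.0, eta1=0.01, eta2=0.9,
           tol=1e-6, max_iter=500):
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(proj(x - g) - x) <= tol:   # crude criticality proxy
            break
        Bk = B(x)
        m = lambda s: g @ s + 0.5 * s @ Bk @ s + (sigma / 3) * np.linalg.norm(s) ** 3
        t, s = 1.0, proj(x - g) - x
        while m(s) > 0.1 * (g @ s) and t > 1e-20:    # backtrack to sufficient decrease
            t *= 0.5
            s = proj(x - t * g) - x
        rho = (f(x) - f(x + s)) / (-m(s))            # -m(s) = f(x_k) - m_k(x_k + s)
        if rho >= eta1:                              # Step 3: accept
            x = x + s
        # Step 4: sigma update (the concrete factors are an assumption)
        sigma = 0.5 * sigma if rho >= eta2 else (sigma if rho >= eta1 else 2.0 * sigma)
    return x
```

With proj = lambda z: np.clip(z, l, u) this runs on bound-constrained problems.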

20 Regularization techniques for constrained problems: Local convergence theory for COCARC

The Cauchy condition:
$$m_k(x_k) - m_k(x_k + s_k) \ge \kappa_{CR}\, \chi_k \min\left[\frac{\chi_k}{1 + \|H_k\|},\ \sqrt{\frac{\chi_k}{\sigma_k}},\ 1\right].$$

The bound on the stepsize:
$$\|s_k\| \le 3 \max\left[\frac{\|H_k\|}{\sigma_k},\ \left(\frac{\chi_k}{\sigma_k}\right)^{1/2},\ \left(\frac{\chi_k}{\sigma_k}\right)^{1/3}\right].$$

And therefore $\lim_{k \to \infty} \chi_k = 0$. (Cartis/Gould/Toint)

21 Regularization techniques for constrained problems: Function-Evaluation Complexity for COCARC (1)

But what about function-evaluation complexity? If, on very successful iterations, $\sigma_{k+1} \ge \gamma_3 \sigma_k$ for some $\gamma_3 < 1$, the COCARC algorithm requires at most $\kappa_C\, \epsilon^{-2}$ function evaluations (for some $\kappa_C$ independent of $\epsilon$) to achieve $\chi_k \le \epsilon$ (c.f. steepest descent). Do the nicer bounds for unconstrained optimization extend to the constrained case?

22 Regularization techniques for constrained problems: Function-evaluation complexity for COCARC (2)

As for the unconstrained case, impose a termination rule on the subproblem solution: do not terminate solving $\min_{x_k + s \in \mathcal{F}} m_k(x_k + s)$ before
$$\chi^m_k(x_k^+) \le \min(\kappa_{stop}, \|s_k\|)\, \chi_k,$$
where
$$\chi^m_k(x) \stackrel{\text{def}}{=} \left|\, \min_{x+d \in \mathcal{F},\ \|d\| \le 1} \langle \nabla_x m_k(x), d \rangle \,\right|.$$
Note: this is OK at local constrained model minimizers (c.f. the s-rule for unconstrained; a sketch of the test follows).
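
The test itself is one line given $\chi^m_k$; reusing the box/$\ell_\infty$ version of $\chi$ from the earlier sketch (inlined here, same assumptions):

```python
import numpy as np

def inner_stop(gm_plus, x_plus, s, chi_k, l, u, kappa_stop=0.1):
    """Stop the subproblem solve once chi_m(x+) <= min(kappa_stop, ||s||) * chi_k,
    where chi_m uses the model gradient gm_plus at the trial point x+."""
    d = np.where(gm_plus > 0, np.maximum(-1.0, l - x_plus),
                 np.minimum(1.0, u - x_plus))
    chi_m = abs(gm_plus @ d)
    return chi_m <= min(kappa_stop, np.linalg.norm(s)) * chi_k
```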

23 Regularization techniques for constrained problems: Walking through the pass

[Figure: "beyond the pass": a constrained problem on a nonconvex model $m(x, y)$ with cubic regularization term $[x^2 + y^2]^{3/2}$, showing $x_k$, the path $x_k - \alpha g_k$, the feasible region and the model minimizer beyond the pass.]

24 Regularization techniques for constrained problems: Walking through the pass... with a sherpa

[Figure: a piecewise descent path from $x_k$ through intermediate points $x_{k,c}$ and $x_{k,a}$ to $x_k^+$ on the same model $m(x, y)$, staying feasible throughout.]

25 Regularization techniques for constrained problems: Function-Evaluation Complexity for COCARC (3)

Assume also that: $x_k \to x_k^+$ is achieved in a bounded number of feasible descent substeps; $\|H_k - \nabla_{xx} f(x_k)\| \le \kappa \|s_k\|^2$; $\nabla_{xx} f(\cdot)$ is globally Lipschitz continuous; and $\{x_k\}$ is bounded. Then the COCARC algorithm requires at most $\kappa_C\, \epsilon^{-3/2}$ function evaluations (for some $\kappa_C$ independent of $\epsilon$) to achieve $\chi_k \le \epsilon$ (c.f. the unconstrained case!). Caveat: the cost of solving the subproblem.

26 Conclusions

Much left to do... but very interesting! Meaningful numerical evaluation is still needed for many of these algorithms, and many issues regarding regularizations remain unresolved. Many thanks for your attention!
