Worst-case evaluation complexity of regularization methods for smooth unconstrained optimization using Hölder continuous gradients


C. Cartis, N. I. M. Gould and Ph. L. Toint

26 June 2015

Abstract

The worst-case behaviour of a general class of regularization algorithms is considered in the case where only objective function values and associated gradient vectors are evaluated. Upper bounds are derived on the number of such evaluations that are needed for the algorithm to produce an approximate first-order critical point whose accuracy is within a user-defined threshold. The analysis covers the entire range of meaningful powers in the regularization term as well as in the Hölder exponent for the gradient. The resulting complexity bounds vary according to the regularization power and the assumed Hölder exponent, recovering known results when available.

1 Introduction

The complexity analysis of algorithms for smooth, possibly non-convex, unconstrained optimization has been the subject of a burgeoning literature over the past few years (see the contributions by Nesterov [12, 15], Gratton, Sartenaer and Toint [11], Cartis, Gould and Toint [3, 5, 6, 7], Ueda [17], Ueda and Yamashita [18, 19], Grapiglia, Yuan and Yuan [9, 10], and Vicente [20], for instance). The present contribution belongs to this active trend and focuses on the analysis of the worst-case behaviour of regularization methods in which only objective function values and associated gradient vectors are evaluated. It proposes upper bounds on the number of such evaluations that are needed for the algorithm to produce an approximate first-order critical point whose accuracy is within a user-defined threshold. An analysis of this type is already available for the case where the objective function's gradient is assumed to be Lipschitz continuous and where the regularization uses the second or third power of the norm of the computed step at a given iteration (see the paper by Nesterov [13] for the former and those of Cartis et al. [5, 6] for both cases). The novelty of the present approach is to extend the analysis to cover problems whose objective gradients are simply Hölder continuous and methods that allow weaker regularization than in the Lipschitz case.

Mathematical Institute, Oxford University, Oxford OX2 6GG, Great Britain (coralia.cartis@maths.ox.ac.uk). Numerical Analysis Group, Rutherford Appleton Laboratory, Chilton OX11 0QX, Great Britain (nick.gould@stfc.ac.uk). Namur Center for Complex Systems (naXys) and Department of Mathematics, University of Namur, 61, rue de Bruxelles, B-5000 Namur, Belgium (philippe.toint@unamur.be).

The resulting complexity bounds vary according to the regularization power and the assumed Hölder exponent, providing a unified view and recovering known results when available.

The paper is organized as follows. Section 2 presents the problem and the class of algorithms considered. The complexity analysis itself is given in Section 3, and the sharpness of the obtained result is discussed in Section 4. Section 5 finally provides some comments on the results.

Notations: In what follows, ‖·‖ denotes the Euclidean norm and the superscript T denotes transposition. If v is a vector in ℝ^n, v_i denotes its i-th component.

2 The problem and algorithm

We consider the problem of finding an approximate solution of the optimization problem

  min_x f(x),    (2.1)

where x ∈ ℝ^n is the vector of optimization variables and f is a function from ℝ^n into ℝ that is assumed to be bounded below and continuously differentiable with Hölder continuous gradients. If we denote g(x) = ∇_x f(x), the latter says that the inequality

  ‖g(x) − g(y)‖ ≤ L ‖x − y‖^β    (2.2)

holds for all x, y ∈ ℝ^n, where L ≥ 0 and β > 0 are constants independent of x and y and where ‖·‖ is the Euclidean norm on ℝ^n. As explained in Lemma 3.1 below, we will assume, without loss of generality, that β ≤ 1. Problems involving functions with Hölder continuous gradients are interesting in their own right, but can also be found in engineering practice, such as in the design of gas pipelines (the Panhandle law which governs such flows states that the gas flow rate in a pipeline is a power between 1 and 2 of the difference in squared pressures, see [16, Section 17], for instance). Such functions also appear in the solution of certain nonlinear PDE problems (see Bensoussan and Frehse [1]).

In our context, an approximate solution for problem (2.1) is a vector x_ε such that

  ‖g(x_ε)‖ ≤ ε or f(x_ε) ≤ f_target,    (2.3)

where ε > 0 is a user-specified accuracy threshold and f_target is a threshold value, independent of ε, under which the reduction of the objective function is deemed sufficient by the user. The first case in (2.3) corresponds to finding an approximate first-order critical point. If a suitable value for f_target is not known, minus infinity can be used instead, in effect making the second part of (2.3) impossible to satisfy and reducing this condition to its first part.

The class of regularization methods that we consider for computing an x satisfying (2.3) consists of iterative algorithms where, at each iteration, a local (linear or quadratic) model of f around the current iterate x_k is constructed, regularized by a term using the p-th power of the norm of the step, and then approximately minimized (in the Cauchy point sense) to provide a trial step s_k. The quality of this step is then measured in order to accept the resulting trial point x_k + s_k as the next iterate, or to reject it and adjust the strength of the regularization. More specifically, a regularized model of f(x_k + s) of the form

  m_k(x_k + s) = f(x_k) + g_k^T s + (1/2) s^T B_k s + (σ_k/p) ‖s‖^p    (2.4)

is considered around the k-th iterate x_k, where we have defined g_k = g(x_k), where B_k is a symmetric n × n matrix, where σ_k > 0 is the regularization parameter at iteration k and where p > 1 is the (iteration-independent) user-defined regularization power. In practice, the matrix B_k may be chosen to provide suitable scaling of the variables (if known), for instance using quasi-Newton formulae. The model (2.4) is then approximately minimized in the sense that the trial step s_k is computed such that

  m_k(x_k + s_k) ≤ m_k(x_k + s_k^C),    (2.5)

where the Cauchy step s_k^C is defined by

  s_k^C = −α_k^C g_k with α_k^C = arg min_{α ≥ 0} m_k(x_k − α g_k).    (2.6)

We will choose the regularization power p in (2.4) in order to guarantee that m_k is bounded below and grows at infinity, thereby ensuring that (2.6) is well-defined. In particular, this imposes the restriction

  p > 1, and furthermore p > 2 whenever B_k is allowed not to be positive semi-definite.    (2.7)

Notice that (2.5) and (2.6) together imply that

  m_k(x_k + s_k) ≤ m_k(x_k + s_k^C) < f(x_k)    (2.8)

provided g(x_k) ≠ 0. We may now describe our class of algorithms more formally as Algorithm 2.1 on the following page. Iterations of Algorithm 2.1 where ρ_k ≥ η_1 are called successful and their index set is denoted by S. Note that the mechanism of the algorithm ensures that σ_k > 0 for all k ≥ 0. Note also that each iteration of the algorithm involves a single evaluation of the objective function and (for successful iterations only) of its gradient. The evaluation complexity analysis can therefore be carried out by measuring how many iterations are needed before an approximate first-order critical point is found or the objective value decreases below the required target.

If p = 2 or p = 3, the model minimization occurring in Step 2 of the algorithm is typically easy to compute if one is happy with the minimum requirement that (2.5) and (2.6) hold: an efficient unidimensional linesearch technique using quadratic or cubic interpolation is all that is needed. Larger model decrease may be obtained by pursuing the minimization beyond the Cauchy point, and again efficient algorithms are known for quadratic and cubic regularizations (see Cartis et al. [4] for the latter case, the former being the well-known problem of minimizing a quadratic function). Good methods are also available for more general values of p (in effect requiring the one-dimensional minimization of a p-th order polynomial): see Cartis et al. [2] for the case of regularized least-norm problems with general p ≥ 2, or Gould, Robinson and Thorne [8] for even more general cases.

3 Worst-case evaluation complexity analysis

In order to analyze the worst-case complexity of Algorithm 2.1, we need to specify our assumptions.

Algorithm 2.1: A Class of First-Order Adaptive Regularization Methods

Step 0: Initialization. An initial point x_0, a target objective function value f_target ≤ f(x_0) and an initial regularization parameter σ_0 > 0 are given, as well as an accuracy level ε > 0. The constants η_1, η_2, γ_1, γ_2 and γ_3 are also given and satisfy

  0 < η_1 ≤ η_2 < 1 and 0 < γ_1 < 1 < γ_2 < γ_3.    (2.9)

Compute f(x_0) and set k = 0.

Step 1: Test for termination. If ‖g_k‖ ≤ ε or f(x_k) ≤ f_target, terminate with the approximate solution x_ε = x_k.

Step 2: Step calculation. Compute the step s_k approximately by minimizing the model (2.4) in the sense that conditions (2.5) and (2.6) hold.

Step 3: Acceptance of the trial point. Compute f(x_k + s_k) and define

  ρ_k = [f(x_k) − f(x_k + s_k)] / [m_k(x_k) − m_k(x_k + s_k)].    (2.10)

If ρ_k ≥ η_1, then define x_{k+1} = x_k + s_k and evaluate g(x_{k+1}); otherwise define x_{k+1} = x_k.

Step 4: Regularization parameter update. Set

  σ_{k+1} ∈ [γ_1 σ_k, σ_k]   if ρ_k ≥ η_2,
  σ_{k+1} ∈ [σ_k, γ_2 σ_k]   if ρ_k ∈ [η_1, η_2),
  σ_{k+1} ∈ [γ_2 σ_k, γ_3 σ_k]   if ρ_k < η_1.    (2.11)

Increment k by one and go to Step 1.

AS.1 The objective function f is continuously differentiable on ℝ^n.

AS.2 g = ∇_x f is Hölder continuous in the sense that (2.2) holds for all x, y ∈ ℝ^n and some constants L ≥ 0 and 0 < β ≤ 1.

AS.3 There exists a constant f_low (possibly equal to minus infinity) such that, for all x ∈ ℝ^n, f(x) ≥ f_low, and f_* := max[f_low, f_target] > −∞.

AS.4 There exist constants κ_gl ≥ 0 and κ_gu > 0 such that κ_gl ≤ ‖g(x)‖ ≤ κ_gu for all x ∈ ℝ^n such that f_* ≤ f(x) ≤ f(x_0).

AS.5 There exists a constant κ_B ≥ 0 such that, for all k ≥ 0, ‖B_k‖ ≤ κ_B.
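To make the mechanism of Algorithm 2.1 concrete, the following Python sketch implements it in the simplest setting B_k = 0, computing the Cauchy step (2.6) by a bounded one-dimensional minimization so that only the minimal requirements (2.5)-(2.6) are enforced. It is only an illustration under those simplifying assumptions (the function names, tolerances and update factors are ours, not part of the paper); a practical implementation would exploit the structure of the one-dimensional subproblem and allow B_k ≠ 0.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def first_order_arp(f, grad, x0, p=2.0, sigma0=1.0, eps=1e-5,
                    eta1=0.1, eta2=0.9, gamma1=0.5, gamma2=2.0,
                    f_target=-np.inf, max_iter=10000):
    """Sketch of Algorithm 2.1 with B_k = 0 (Cauchy-point steps only)."""
    x, sigma = np.asarray(x0, dtype=float), sigma0
    fx, g = f(x), grad(x)
    for k in range(max_iter):
        if np.linalg.norm(g) <= eps or fx <= f_target:          # Step 1: termination
            return x, fx, k
        def model_along_g(alpha):                                 # m_k(x_k - alpha g_k) with B_k = 0
            s = -alpha * g
            return fx + g @ s + (sigma / p) * np.linalg.norm(s) ** p
        alpha_c = minimize_scalar(model_along_g, bounds=(0.0, 1e10),
                                  method='bounded').x             # Cauchy step size (2.6)
        s = -alpha_c * g                                          # Step 2: trial step
        m_decrease = fx - model_along_g(alpha_c)
        f_trial = f(x + s)                                        # Step 3: acceptance test (2.10)
        rho = (fx - f_trial) / m_decrease if m_decrease > 0 else -np.inf
        if rho >= eta1:                                           # successful: accept and re-evaluate g
            x, fx = x + s, f_trial
            g = grad(x)
        if rho >= eta2:                                           # Step 4: update sigma within (2.11)
            sigma = max(gamma1 * sigma, 1e-12)
        elif rho < eta1:
            sigma = gamma2 * sigma
        # for eta1 <= rho < eta2 we keep sigma unchanged, which lies in [sigma, gamma2*sigma]
    return x, fx, max_iter
```

For instance, applying this sketch to f(x) = Σ_i |x_i|^{3/2}, whose gradient 1.5 sign(x_i)|x_i|^{1/2} is Hölder continuous with β = 1/2 but not Lipschitz near the origin, illustrates the regime p = 2 > 1 + β covered by part (ii) of Theorem 3.10 below.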

AS.1 and AS.2 formalize our framework as described in the introduction, while AS.5 is standard in similar contexts and avoids possibly infinite curvature of the model, which would make the regularization irrelevant. Note that the values of L ≥ 0 and β > 0 are often unknown to the user. AS.3 states that, if no target value is specified by the user, then there must exist a global lower bound on the objective function's values to make the minimization problem meaningful. The role of AS.4 is to take into account that, when f_* = f_target > f_low, it may well happen that no single x ∈ ℝ^n satisfies both conditions in (2.3), and thus that the first termination criterion in (2.3) cannot be satisfied by our minimization algorithm before the second. We take this possibility into account by allowing κ_gl > 0, and expressing the complexity results in terms of

  ε_* = max[ε, κ_gl],    (3.1)

which is the attainable gradient accuracy for the problem given f_target. For simplicity of exposition, we assume for now that ε_* < 1, but comment on the case ε_* ≥ 1 at the end of the paper. We note that AS.4 automatically holds if the set {x ∈ ℝ^n | f_* ≤ f(x) ≤ f(x_0)} is bounded, but also, as we discuss in Lemma 3.2 below, in the frequent situation where f(x) is bounded below on the level set {x ∈ ℝ^n | f(x) ≤ f(x_0)}.

We start by deriving consequences of our assumptions which are independent of the algorithm. The first is intended to explore the consequences of a value of β exceeding one.

Lemma 3.1. Suppose that AS.1 holds and that AS.2 holds for some β > 1. Then f is linear on ℝ^n, AS.2 holds for all β > 0 with L = 0, and AS.4 holds with κ_gl = κ_gu = ‖g(x_0)‖.

Proof. If e_i is the i-th vector of the canonical basis and g(x)_i the i-th component of the gradient at x, we have, using the Cauchy-Schwarz inequality and the Hölder condition (2.2), that, for all i = 1, ..., n and all x ∈ ℝ^n,

  |g(x + t e_i)_i − g(x)_i| / |t| ≤ ‖g(x + t e_i) − g(x)‖ / ‖x + t e_i − x‖ ≤ L |t|^{β−1},

with β − 1 > 0. Taking the limit when t → 0 gives that the directional derivative of each g(·)_i exists and is zero for all i and at all x. Thus the gradient is constant on ℝ^n, f is linear and AS.2 obviously holds with L = 0 for all β > 0 since ‖g(x) − g(y)‖ is identically zero for all x, y ∈ ℝ^n. □

This justifies our choice to restrict our attention to the case where β ∈ (0, 1] for the rest of our analysis. The second result indicates common circumstances in which AS.4 holds.

Lemma 3.2. Suppose that AS.1 and AS.2 hold, and that there exists a constant f_low > −∞ such that

  f(x) ≥ f_low    (3.2)

for all x ∈ L_0 := {x ∈ ℝ^n | f(x) ≤ f(x_0)}. Then AS.4 holds.

Proof. Let x ∈ L_0. AS.1, the mean-value theorem, and AS.2 then ensure that, for all s,

  f_low ≤ f(x + s) = f(x) + g(x)^T s + ∫_0^1 (g(x + ξ s) − g(x))^T s dξ ≤ f(x) + g(x)^T s + (L/(1+β)) ‖s‖^{1+β} =: h(s).

Given that the minimizer of the convex function h(s) is given by

  s_* = −(‖g(x)‖/L)^{1/β} g(x)/‖g(x)‖,    (3.3)

we obtain that

  min_s h(s) = h(s_*) = f(x) − (β/(1+β)) L^{−1/β} ‖g(x)‖^{(1+β)/β}.

As a consequence, we obtain, using the fact that f(x) ≤ f(x_0) (since x ∈ L_0) and (3.3), that

  (β/(1+β)) L^{−1/β} ‖g(x)‖^{(1+β)/β} ≤ f(x_0) − f_low,

which in turn implies that

  ‖g(x)‖ ≤ [ L^{1/β} ((1+β)/β) (f(x_0) − f_low) ]^{β/(1+β)} =: κ_gu,

irrespective of the value of f_target. This and the choice κ_gl = 0 yield the desired conclusion. □

Note that (3.2) is indeed very common. For instance, f_low = 0 for all nonlinear least-squares problems. Hence the form of AS.4 should not be viewed as overly restrictive, and it also allows for the case where (3.2) fails but the objective function's gradient remains reasonably well-behaved. For instance, problems whose objective function is an indefinite quadratic are allowed provided f_target > −∞.

We now turn to the analysis of the algorithm's properties. But, before we start in earnest, it is useful to introduce some specific notation. On a number of occasions, we need to include some of the terms in formulae only if certain conditions apply. We indicate this by appending the relevant condition in braces immediately after the conditional term (the original paper uses an underbrace for this purpose). For instance, we may write an expression of the type

  max[ a {a > 0}, b, c ],

meaning that the maximum should include the first term if and only if a > 0 (making the term well-defined in this case).

We first derive two bounds on the step length, generalizing Lemma 2.2 in [4].

Lemma 3.3. We have that, for all k ≥ 0,

  ‖s_k‖ ≤ max[ (p‖B_k‖/σ_k)^{1/(p−2)} {B_k ≠ 0}, (2p‖g_k‖/σ_k)^{1/(p−1)} ].    (3.4)

Moreover,

  ‖s_k‖ ≤ (2p‖g_k‖/σ_k)^{1/(p−1)}    (3.5)

provided

  σ_k ≥ (p‖B_k‖)^{p−1} / (2p‖g_k‖)^{p−2}.    (3.6)

Proof. Observe first that (2.4), (2.8) and g_k ≠ 0 ensure that

  m_k(x_k + s_k) − f(x_k) = g_k^T s_k + (1/2) s_k^T B_k s_k + (σ_k/p)‖s_k‖^p < 0.    (3.7)

Assume first that s_k^T B_k s_k > 0. Then we must have that g_k^T s_k + (σ_k/p)‖s_k‖^p < 0, and therefore (remembering that σ_k > 0 and that g_k^T s_k ≥ −‖g_k‖ ‖s_k‖)

  ‖s_k‖ < (p‖g_k‖/σ_k)^{1/(p−1)} ≤ (2p‖g_k‖/σ_k)^{1/(p−1)}.    (3.8)

If s_k^T B_k s_k ≤ 0, we may rewrite (3.7) as

  [ g_k^T s_k + (σ_k/(2p))‖s_k‖^p ] + [ (1/2) s_k^T B_k s_k + (σ_k/(2p))‖s_k‖^p ] < 0,

and the left-hand side of this inequality can only be negative if at least one of the bracketed expressions is negative, giving that

  ‖s_k‖ ≤ max[ (p‖B_k‖/σ_k)^{1/(p−2)}, (2p‖g_k‖/σ_k)^{1/(p−1)} ],

where we also used that g_k^T s_k ≥ −‖g_k‖ ‖s_k‖ and s_k^T B_k s_k ≥ −‖B_k‖ ‖s_k‖^2. Combining this with (3.8) then yields (3.4). Checking (3.5) subject to (3.6) is straightforward. □

We now turn to the task of finding a lower bound on the model decrease f(x_k) − m_k(x_k + s_k) resulting from (2.5)-(2.6). The first step is to find a suitable positive lower bound on the step size α_k^C defined in (2.6).

Lemma 3.4. We have that

  m_k(x_k + s_k^C) ≤ m_k(x_k − α_k g_k) < f(x_k),    (3.9)

where

  α_k = min[ ‖g_k‖^2/(2 g_k^T B_k g_k) {g_k^T B_k g_k > 0}, (p/(2 σ_k ‖g_k‖^{p−2}))^{1/(p−1)} ].    (3.10)

Proof. Substituting s = −α g_k into (2.4), we obtain that, for all α > 0,

  m_k(x_k − α g_k) − f(x_k) = α [ −‖g_k‖^2 + (α/2) g_k^T B_k g_k + (σ_k/p) α^{p−1} ‖g_k‖^p ].    (3.11)

Assume first that g_k^T B_k g_k ≤ 0. Then, for all α ∈ (0, ᾱ_k), we have −‖g_k‖^2 + (σ_k/p) α^{p−1} ‖g_k‖^p < 0, where

  ᾱ_k = (p/(σ_k ‖g_k‖^{p−2}))^{1/(p−1)},    (3.12)

and, because α > 0 and g_k^T B_k g_k ≤ 0, we also obtain from (3.11) that m_k(x_k − α g_k) < f(x_k) for all α ∈ (0, ᾱ_k). In particular, this yields that m_k(x_k − α_k g_k) < f(x_k), where

  α_k = ᾱ_k / 2^{1/(p−1)} = (p/(2 σ_k ‖g_k‖^{p−2}))^{1/(p−1)}.    (3.13)

Condition (2.6) then ensures that (3.9) holds, as desired.

Assume next that g_k^T B_k g_k > 0 and, in this case, define

  α_k = min[ ‖g_k‖^2/(2 g_k^T B_k g_k), (p/(2 σ_k ‖g_k‖^{p−2}))^{1/(p−1)} ].

Then it is easy to verify that both bracketed expressions in

  m_k(x_k − α_k g_k) − f(x_k) = α_k [ −‖g_k‖^2/2 + (α_k/2) g_k^T B_k g_k ] + α_k [ −‖g_k‖^2/2 + (σ_k/p) α_k^{p−1} ‖g_k‖^p ]

are negative and thus, because α_k > 0, that m_k(x_k − α_k g_k) < f(x_k). The desired conclusion can now be obtained by invoking (2.6). □
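As a quick sanity check of Lemma 3.4 (and of the decrease bound that follows in Lemma 3.5), the small script below evaluates the model along −g_k for a randomly chosen symmetric B_k and verifies that the safeguard step size α_k of (3.10) produces a model decrease of at least one quarter of α_k ‖g_k‖^2. This is merely an illustrative numerical experiment with arbitrarily chosen data, not part of the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma = 5, 3.0, 0.7                          # p > 2, so an indefinite B_k is allowed by (2.7)
g = rng.standard_normal(n)
B = rng.standard_normal((n, n)); B = (B + B.T) / 2  # symmetric, possibly indefinite

def model_decrease(alpha):
    # f(x_k) - m_k(x_k - alpha g_k), expanded as in (3.11)
    return alpha * (g @ g) - 0.5 * alpha**2 * (g @ B @ g) \
           - (sigma / p) * alpha**p * np.linalg.norm(g)**p

gBg = g @ B @ g
alpha_k = (p / (2 * sigma * np.linalg.norm(g)**(p - 2)))**(1 / (p - 1))
if gBg > 0:                                         # include the curvature term of (3.10)
    alpha_k = min(alpha_k, (g @ g) / (2 * gBg))

assert model_decrease(alpha_k) > 0                              # Lemma 3.4: strict model decrease
assert model_decrease(alpha_k) >= 0.25 * alpha_k * (g @ g) - 1e-12   # the (1/4) alpha_k ||g_k||^2 bound
print(model_decrease(alpha_k), 0.25 * alpha_k * (g @ g))
```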

We now translate the conclusions of the last lemma in terms of the model reduction at the Cauchy point and beyond, generalizing Lemma 2.1 in [4].

Lemma 3.5. We have that

  f(x_k) − m_k(x_k + s_k) ≥ (1/4) min[ ‖g_k‖^4/(2 g_k^T B_k g_k) {g_k^T B_k g_k > 0}, (p/(2σ_k))^{1/(p−1)} ‖g_k‖^{p/(p−1)} ].    (3.14)

Proof. If g_k^T B_k g_k ≤ 0, substituting (3.13) into (3.11) immediately yields that

  f(x_k) − m_k(x_k − α_k g_k) ≥ (1/2) α_k ‖g_k‖^2 = (1/2) (p/(2σ_k))^{1/(p−1)} ‖g_k‖^{p/(p−1)}.    (3.15)

If g_k^T B_k g_k > 0, we have from (3.11) and (3.10) that

  f(x_k) − m_k(x_k − α_k g_k) ≥ α_k ‖g_k‖^2 − (α_k^2/2) g_k^T B_k g_k − (σ_k/p) α_k^p ‖g_k‖^p ≥ α_k ‖g_k‖^2 (1 − 1/4 − 1/2) = (1/4) α_k ‖g_k‖^2 = (1/4) min[ ‖g_k‖^4/(2 g_k^T B_k g_k), (p/(2σ_k))^{1/(p−1)} ‖g_k‖^{p/(p−1)} ].

Combining this last inequality with (3.15) and using (2.5) then gives (3.14). □

The model decrease specified by (3.14) turns out to be useful if the value of σ_k appearing in the denominator of the second term in the min can be bounded above across all iterations. We obtain this result in two stages, the first being to determine conditions under which an iteration must be very successful.

Lemma 3.6. Suppose that AS.1, AS.2 and AS.5 hold. Then ρ_k ≥ η_2, iteration k is very successful and σ_{k+1} ≤ σ_k

(i) if p ≤ 1 + β and

  σ_k ≥ κ_1 ‖g_k‖^{(1+β−p)/β},    (3.16)

where

  κ_1 = (2p)^{(1+β−p)/β} [ pL/(1+β) ]^{(p−1)/β};

(ii) if 1 + β < p and

  σ_k ≥ κ_2 max[ ‖g_k‖^{2−p}, ‖g_k‖^{(1+β−p)/β} ],    (3.17)

where

  κ_2 = max[ (p/2)(2κ_B)^{p−1}, 2^{(2+β)/β} p κ_3^{(p−1)/β}, 8p κ_3^{p−1} ]    (3.18)

with

  κ_3 = (4/(1−η_2)) [ L/(1+β) + κ_B/2 ].    (3.19)

Proof. First notice that AS.1, the mean-value theorem and (2.4) imply that

  f(x_k + s_k) − m_k(x_k + s_k) = ∫_0^1 (g(x_k + ξ s_k) − g_k)^T s_k dξ − (1/2) s_k^T B_k s_k − (σ_k/p) ‖s_k‖^p.

Using now AS.2, we obtain that

  f(x_k + s_k) − m_k(x_k + s_k) ≤ (L/(1+β)) ‖s_k‖^{1+β} − (1/2) s_k^T B_k s_k − (σ_k/p) ‖s_k‖^p.    (3.20)

Assume first that p ≤ 1 + β (which implies that B_k ⪰ 0 because of (2.7)). Then f(x_k + s_k) ≤ m_k(x_k + s_k) (and thus ρ_k ≥ 1 > η_2) if

  σ_k ≥ (pL/(1+β)) ‖s_k‖^{1+β−p},

which, in view of (3.4) and B_k ⪰ 0, holds if

  σ_k ≥ (pL/(1+β)) (2p‖g_k‖/σ_k)^{(1+β−p)/(p−1)},    (3.21)

that is, if

  σ_k ≥ (2p)^{(1+β−p)/β} [ pL/(1+β) ]^{(p−1)/β} ‖g_k‖^{(1+β−p)/β},

proving the first item in the lemma's statement.

Assume now that p > 1 + β, in which case B_k is allowed to be indefinite if p > 2 and we cannot guarantee that s_k^T B_k s_k ≥ 0 in (3.20). Then ρ_k ≥ η_2 if

  r_k := f(x_k + s_k) − m_k(x_k + s_k) − (1 − η_2)(f(x_k) − m_k(x_k + s_k)) < 0.

Note that a lower bound on f(x_k) − m_k(x_k + s_k) is given by Lemma 3.5. If we now assume that, whenever g_k^T B_k g_k > 0,

  σ_k ≥ (p/2)(2κ_B)^{p−1} ‖g_k‖^{2−p},    (3.22)

then we obtain that the minimum occurring in the right-hand side of (3.14) is achieved by the second term, yielding that

  f(x_k) − m_k(x_k + s_k) ≥ (1/4) (p/(2σ_k))^{1/(p−1)} ‖g_k‖^{p/(p−1)}.

As a consequence, we obtain from (3.20), the Cauchy-Schwarz inequality and AS.5 that

  r_k ≤ (L/(1+β)) ‖s_k‖^{1+β} + (κ_B/2) ‖s_k‖^2 − ((1−η_2)/4) (p/(2σ_k))^{1/(p−1)} ‖g_k‖^{p/(p−1)}.

If we also assume that, whenever B_k ≠ 0, (3.6) also holds, then we may substitute the upper bound (3.5) in this inequality and obtain that r_k < 0 if

  (L/(1+β)) (2p‖g_k‖/σ_k)^{(1+β)/(p−1)} + (κ_B/2) (2p‖g_k‖/σ_k)^{2/(p−1)} < ((1−η_2)/4) (p/(2σ_k))^{1/(p−1)} ‖g_k‖^{p/(p−1)}.

Now, if, on one hand,

  (2p‖g_k‖/σ_k)^{(1+β)/(p−1)} ≥ (2p‖g_k‖/σ_k)^{2/(p−1)},    (3.23)

then we obtain that r_k < 0 if

  (L/(1+β) + κ_B/2) (2p‖g_k‖/σ_k)^{(1+β)/(p−1)} < ((1−η_2)/4) (p/(2σ_k))^{1/(p−1)} ‖g_k‖^{p/(p−1)}.

Taking the (p−1)-th power and rearranging, we obtain that r_k < 0 if

  σ_k ≥ 2^{(2+β)/β} p κ_3^{(p−1)/β} ‖g_k‖^{(1+β−p)/β}.    (3.24)

If, on the other hand, (3.23) fails, then r_k < 0 if

  (L/(1+β) + κ_B/2) (2p‖g_k‖/σ_k)^{2/(p−1)} < ((1−η_2)/4) (p/(2σ_k))^{1/(p−1)} ‖g_k‖^{p/(p−1)}.

Once more taking the (p−1)-th power and rearranging, we obtain that r_k < 0 if

  σ_k ≥ 8p κ_3^{p−1} ‖g_k‖^{2−p}.    (3.25)

Thus r_k < 0 (and therefore ρ_k ≥ η_2) when p > 1 + β provided (3.24) and (3.25) hold, together with (3.6) (when B_k ≠ 0) and (3.22) (when g_k^T B_k g_k > 0). This proves the second item in the lemma's statement, once we note that the right-hand sides of (3.6), (3.22), (3.24) and (3.25) are all bounded above by κ_2 max[ ‖g_k‖^{2−p}, ‖g_k‖^{(1+β−p)/β} ] with κ_2 given by (3.18). □

Note that the second part of the lemma extends the result of Lemma 3.1 in [5] to general p and β. We are now in position to prove an iteration-independent upper bound on the value of σ_k.

Lemma 3.7. Suppose that AS.1–AS.5 hold and that ε_* < 1. Then, as long as the algorithm does not terminate, we have that, for all k ≥ 0,

(i) if p ≤ 1 + β,

  σ_k ≤ κ_σ1,    (3.26)

where

  κ_σ1 = max[ γ_3 κ_1 κ_gu^{(1+β−p)/β}, σ_0 ];    (3.27)

(ii) if 1 + β < p,

  σ_k ≤ max[ κ_σ2, κ_σ3 ε_*^{−(p−1−β)/β} ],    (3.28)

where

  κ_σ2 = max[ γ_3 κ_2 κ_gu^{2−p} {p < 2}, σ_0 ] and κ_σ3 = γ_3 κ_2,    (3.29)

with κ_1 and κ_2 defined in Lemma 3.6.

Proof. We again distinguish two cases. Assume first that p ≤ 1 + β, which in turn implies that p ≤ 2 and thus, in view of (2.7), that B_k ⪰ 0 for all k. Then AS.4 and condition (3.16) of Lemma 3.6 (i) imply that σ_{k+1} ≤ σ_k provided

  σ_k ≥ κ_1 κ_gu^{(1+β−p)/β},    (3.30)

which is a constant independent of k and ε.

The second case is when 1 + β < p. We first consider the subclass where p < 2, where, using AS.4,

  ‖g_k‖^{2−p} ≤ κ_gu^{2−p}.    (3.31)

This bound, part (ii) of Lemma 3.6 and the fact that ‖g_k‖ > ε_* as long as the algorithm has not terminated then imply that σ_{k+1} ≤ σ_k provided

  σ_k ≥ κ_2 max[ κ_gu^{2−p} {p < 2}, ε_*^{−(p−1−β)/β} ],    (3.32)

where we have used that (1+β−p)/β < 0. Alternatively, if p ≥ 2, part (ii) of Lemma 3.6 and the fact that ‖g_k‖ > ε_* as long as the algorithm has not terminated then give that σ_{k+1} ≤ σ_k provided

  σ_k ≥ κ_2 max[ ε_*^{2−p}, ε_*^{−(p−1−β)/β} ] = κ_2 ε_*^{−(p−1−β)/β},    (3.33)

where the last equality now results from the fact that, because β ≤ 1,

  0 ≥ 2 − p ≥ −(p−1−β)/β.

The proof of (3.26) and (3.28) is then completed by taking into account that the initial parameter σ_0 may exceed the bound given by the right-hand side of (3.30) (if p ≤ 1+β) or of (3.32)–(3.33) (if 1+β < p), and also that these bounds may just fail by a small margin at an unsuccessful iteration, resulting in an increase of σ_k by a factor at most γ_3 before the relevant bound applies. □

Having now derived an iteration-independent upper bound on σ_k, we may return to the model decrease given by Lemma 3.5.

Lemma 3.8. Suppose that AS.1–AS.5 hold and that ε_* < 1. Then, as long as the algorithm does not terminate,

(i) if p ≤ 1 + β, then

  f(x_k) − m_k(x_k + s_k) ≥ κ_m1 ε_*^{p/(p−1)},    (3.34)

where

  κ_m1 = (1/4) min[ 1/(2κ_B), (p/(2κ_σ1))^{1/(p−1)} ];    (3.35)

(ii) if 1 + β < p, then

  f(x_k) − m_k(x_k + s_k) ≥ κ_m2 ε_*^{(1+β)/β},    (3.36)

where

  κ_m2 = (1/4) min[ 1/(2κ_B), (p/(2 max[κ_σ2, κ_σ3]))^{1/(p−1)} ].    (3.37)

Proof. Assume first that p ≤ 1 + β. As above, this implies that p ≤ 2 and hence, because of (2.7), that g_k^T B_k g_k ≥ 0. Taking into account that, in this case, g_k^T B_k g_k ≤ κ_B ‖g_k‖^2

because of AS.5, substituting (3.26) into (3.14) and using the fact that ‖g_k‖ ≥ ε_* as long as the algorithm has not terminated yields that

  f(x_k) − m_k(x_k + s_k) ≥ (1/4) min[ ε_*^2/(2κ_B), (p/(2κ_σ1))^{1/(p−1)} ε_*^{p/(p−1)} ] ≥ (1/4) min[ 1/(2κ_B), (p/(2κ_σ1))^{1/(p−1)} ] min[ ε_*^2, ε_*^{p/(p−1)} ],

and (3.34) follows since ε_* < 1 and p/(p−1) ≥ 2 for p ∈ (1, 2].

Consider now the case where 1 + β < p. Substituting now (3.28) into (3.14), using AS.5 and the fact that ‖g_k‖ ≥ ε_* as long as the algorithm has not terminated, we obtain that

  f(x_k) − m_k(x_k + s_k) ≥ (1/4) min[ ε_*^2/(2κ_B) {g_k^T B_k g_k > 0}, (p/(2 max[κ_σ2, κ_σ3 ε_*^{−(p−1−β)/β}]))^{1/(p−1)} ε_*^{p/(p−1)} ] ≥ (1/4) min[ 1/(2κ_B), (p/(2 max[κ_σ2, κ_σ3]))^{1/(p−1)} ] min[ ε_*^2, ε_*^{p/(p−1) + (p−1−β)/(β(p−1))} ],

which yields (3.36) since ε_* < 1 and, for 1 + β < p and β ∈ (0, 1],

  p/(p−1) + (p−1−β)/(β(p−1)) = (1+β)/β and (1+β)/β ≥ 2. □

We now recall an important technical lemma which, in effect, gives a bound on the total number of unsuccessful iterations before iteration k as a function of the number of successful ones.

Lemma 3.9. The mechanism of Algorithm 2.1 guarantees that, if

  σ_k ≤ σ_max    (3.38)

for some σ_max > 0, then

  k ≤ |S_k| (1 + |log γ_1|/log γ_2) + (1/log γ_2) log(σ_max/σ_0).    (3.39)

Proof. See [5]. □
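To give a feel for the overhead that unsuccessful iterations add through (3.39), the following few lines evaluate the bound for representative (hypothetical) parameter values; this is only an illustration of the formula, not part of the analysis.

```python
import math

gamma1, gamma2 = 0.5, 2.0        # example algorithmic constants satisfying (2.9)
sigma0, sigma_max = 1.0, 1.0e4   # assumed initial and maximal regularization parameters
S_k = 100                        # number of successful iterations up to iteration k

# Bound (3.39) on the total iteration counter k
bound = S_k * (1 + abs(math.log(gamma1)) / math.log(gamma2)) \
        + math.log(sigma_max / sigma0) / math.log(gamma2)
print(bound)   # 200 + log2(1e4) ~ 213.3: at most ~113 unsuccessful iterations in this scenario
```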

We are now ready to prove our main result on the worst-case complexity of Algorithm 2.1.

Theorem 3.10. Suppose that AS.1–AS.5 hold and that ε_* defined in (3.1) satisfies ε_* < 1.

1. If p ≤ 1 + β, there exist constants κ_s1, κ_a1 and κ_c1 such that, for any ε > 0, Algorithm 2.1 requires at most

  κ_s1 (f(x_0) − f_*) / ε_*^{p/(p−1)}    (3.40)

successful iterations (and gradient evaluations), and a total of

  κ_a1 (f(x_0) − f_*) / ε_*^{p/(p−1)} + κ_c1    (3.41)

iterations (and objective function evaluations), before producing an iterate x_ε such that ‖g(x_ε)‖ ≤ ε or f(x_ε) ≤ f_target.

2. If 1 + β < p, there exist constants κ_s2, κ_a2, κ_b2 and κ_c2 such that, for all ε > 0, Algorithm 2.1 requires at most

  κ_s2 (f(x_0) − f_*) / ε_*^{(1+β)/β}    (3.42)

successful iterations (and gradient evaluations), and a total of

  κ_a2 (f(x_0) − f_*) / ε_*^{(1+β)/β} + κ_b2 |log ε_*| + κ_c2    (3.43)

iterations (and objective function evaluations), before producing an iterate x_ε such that ‖g(x_ε)‖ ≤ ε or f(x_ε) ≤ f_target.

In the above statements the constants are given by

  κ_s1 = 1/(η_1 κ_m1) and κ_s2 = 1/(η_1 κ_m2),    (3.44)

  κ_a1 = (1/(η_1 κ_m1)) (1 + |log γ_1|/log γ_2), κ_c1 = (1/log γ_2) log(κ_σ1/σ_0),    (3.45)

  κ_a2 = (1/(η_1 κ_m2)) (1 + |log γ_1|/log γ_2), κ_b2 = (p−1−β)/(β log γ_2),    (3.46)

  κ_c2 = (1/log γ_2) [ log(max[1, κ_σ2, κ_σ3]) + |log σ_0| ],    (3.47)

where

  κ_1 = (2p)^{(1+β−p)/β} [pL/(1+β)]^{(p−1)/β}, κ_2 = max[ (p/2)(2κ_B)^{p−1}, 2^{(2+β)/β} p κ_3^{(p−1)/β}, 8p κ_3^{p−1} ]    (3.48)

with

  κ_3 = (4/(1−η_2)) [ L/(1+β) + κ_B/2 ],

and

  κ_σ1 = max[ γ_3 κ_1 κ_gu^{(1+β−p)/β}, σ_0 ], κ_σ2 = max[ γ_3 κ_2 κ_gu^{2−p} {p < 2}, σ_0 ], κ_σ3 = γ_3 κ_2,    (3.49)–(3.50)

and

  κ_m1 = (1/4) min[ 1/(2κ_B), (p/(2κ_σ1))^{1/(p−1)} ] and κ_m2 = (1/4) min[ 1/(2κ_B), (p/(2 max[κ_σ2, κ_σ3]))^{1/(p−1)} ].    (3.51)

Proof. Consider first the case where p ≤ 1 + β. We then deduce from AS.3, the definition of a successful iteration and (3.34) in Lemma 3.8 that, as long as the algorithm has not terminated,

  f(x_0) − f_* ≥ f(x_0) − f(x_{k+1}) = Σ_{j ∈ S_k} [f(x_j) − f(x_j + s_j)] ≥ η_1 Σ_{j ∈ S_k} [f(x_j) − m_j(x_j + s_j)] > η_1 κ_m1 ε_*^{p/(p−1)} |S_k|,    (3.52)

where |S_k| is the cardinality of S_k := {j ∈ S | j ≤ k}, that is the number of successful iterations up to iteration k. This provides an upper bound on |S_k| which is independent of k and ε, from which we obtain the bound (3.40) with (3.44). Calling now upon Lemma 3.9 and (3.26), we deduce that the total number of iterations (and function evaluations) cannot exceed

  κ_s1 (f(x_0) − f_*) ε_*^{−p/(p−1)} (1 + |log γ_1|/log γ_2) + (1/log γ_2) log(κ_σ1/σ_0),

which then gives the bound (3.41) with (3.45).

The proof for the case where 1 + β < p is derived in a manner entirely similar to that used for the case where p ≤ 1 + β, replacing ε_*^{p/(p−1)} by ε_*^{(1+β)/β} in (3.52) (since (3.36) is used instead of (3.34)), and also noting that, when using (3.28) instead of (3.26) in Lemma 3.9,

  log( max[ κ_σ2, κ_σ3 ε_*^{−(p−1−β)/β} ] / σ_0 ) ≤ ((p−1−β)/β) |log ε_*| + log(max[1, κ_σ2, κ_σ3]) + |log σ_0|.

We may thus deduce that (3.42) and (3.43) hold with (3.46)–(3.51). □

A close look at the expressions of the constants in (3.44)–(3.51) reveals that the global upper bound on the gradient norm, κ_gu, only occurs in the case where p < 2. Therefore, AS.4 is only needed in this case, since the existence of κ_gl ≥ 0 is always ensured by the non-negativity of ‖g(x)‖.

4 Sharpness

We now show that the bound specified by part (ii) of Theorem 3.10 is essentially sharp, in the sense that we exhibit a class of one-dimensional examples where the number of iterations necessary to produce an approximate first-order critical point is arbitrarily close to the theorem's bound. (Whether this can also be achieved for part (i) of the theorem is still unknown at this point.) To achieve this goal, we first establish sequences of iterates {x_k}, function values {f(x_k)}, gradient values {g_k} and regularization parameter values {σ_k} which can be generated by Algorithm 2.1 and such that the gradient values converge to zero sufficiently slowly to attain the desired lower bound on the number of iterations (and evaluations). Once these are defined, we construct a function f(x) which interpolates these function and gradient values and finally prove that all our assumptions are satisfied.

Because the derivation of the complexity bound involves an increasing sequence of regularization parameters {σ_k}, our example is unfortunately somewhat complicated because it has to include both successful and unsuccessful iterations. We choose to construct it such that all even iterations are unsuccessful and all odd ones are successful. Consider the gradient sequence defined, for p > 1 + β, any arbitrarily small τ ∈ (0, 1), a positive integer q and all k ≥ 0, by

  g_{2k} = ( 1/(k + q) )^{β/(1+β) + τ}, g_{2k+1} = g_{2k},    (4.1)

and observe that the sequence of gradient norms {|g_k|} is non-increasing for any choice of q. Assume first that q = 1. This definition implies that

  ω_{2k} := 1 + (1/2) (g_{2k+3}/g_{2k+1})^{1/β}    (4.2)

satisfies

  ω_{2k} → 3/2    (4.3)

when k tends to infinity. Hence there exists an integer ℓ ≥ 2 such that

  ω_{2k} ∈ [5/4, 3/2] for k ≥ ℓ.    (4.4)

We now (re)define q in (4.1) by setting q = ℓ, in effect shifting the {k} sequence by ℓ, such that (4.2)–(4.4) hold with (4.1) for the complete shifted sequence. Note that q only depends on β, p and τ and is independent of ε.
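The slow decay of the gradient sequence (4.1), and the resulting growth of the regularization parameters (4.5) defined below, can be illustrated numerically. The following sketch uses the reconstructed exponents above (which should be treated as assumptions of this illustration) and illustrative parameter values of our own choosing.

```python
import math

# Generate the slowly-decaying gradient sequence (4.1) of the lower-bound example
# and the matching odd-iteration regularization parameters (4.5).
p, beta, tau, q = 2.0, 0.45, 0.1, 3            # requires p > 1 + beta
eps = 1e-2

k = 0
while (1.0 / (k + q)) ** (beta / (1 + beta) + tau) > eps:    # g_{2k} from (4.1)
    k += 1
iterations = 2 * k                                            # two iterations per index k
predicted = 2 * (eps ** (-(1 + beta) / (beta + tau * (1 + beta))) - q)

g_last = (1.0 / (k + q)) ** (beta / (1 + beta) + tau)
sigma_last = g_last ** ((1 + beta - p) / beta)                # sigma_{2k+1} from (4.5): grows as g_k -> 0
print(iterations, predicted, sigma_last)
```

The printed iteration count matches the predicted lower bound 2(ε^{−(1+β)/(β+τ(1+β))} − q) up to rounding, illustrating how closely the example approaches the upper bound of part (ii) of Theorem 3.10 as τ shrinks.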

Observe also that the rate of (monotonic) convergence of the sequence {g_k} to zero ensures that, for any ε ∈ (0, 1), g_k ≤ ε only for k larger than 2(ε^{−(1+β)/(β+τ(1+β))} − q).

In order to ensure the proper rate of increase of σ_k, we choose to set

  σ_{2k+1} = g_{2k+1}^{(1+β−p)/β}    (4.5)

for all k ≥ 0 (remembering that odd iterations are successful), while the value of σ_{2k} is still to be determined within the constraints of (2.11). Associated with the sequence {g_k}, we define the sequence of iterates {x_k} by

  x_0 = x_1 = 0, x_{2k+2} = x_{2k+1} + s_{2k+1} = x_{2k+3} (k ≥ 0).

In this definition, the step s_{2k+1} at a successful iteration is computed by minimizing the model (2.4) with B_{2k+1} = 0, that is

  m_{2k+1}(x_{2k+1} + s) = f(x_{2k+1}) + g_{2k+1} s + (σ_{2k+1}/p) |s|^p,

over s, where the function value f(x_{2k+1}) is still to be defined. A simple calculation shows that

  s_{2k+1} = (g_{2k+1}/σ_{2k+1})^{1/(p−1)} = g_{2k+1}^{1/β} ≤ g_0^{1/β} < 1,    (4.6)

where we substituted (4.5) to obtain the last equality, and that

  Δm_{2k+1} := m_{2k+1}(x_{2k+1}) − m_{2k+1}(x_{2k+1} + s_{2k+1}) = ((p−1)/p) g_{2k+1}^{(1+β)/β} = ((p−1)/p) g_{2k+1} s_{2k+1}.    (4.7)

Similarly, we also define the step s_{2k} as the minimizer of m_{2k}(x_{2k} + s) with B_{2k} = 0, yielding

  s_{2k} = (g_{2k}/σ_{2k})^{1/(p−1)}    (4.8)

and

  Δm_{2k} := m_{2k}(x_{2k}) − m_{2k}(x_{2k} + s_{2k}) = ((p−1)/p) (g_{2k}^p/σ_{2k})^{1/(p−1)} = ((p−1)/p) g_{2k} s_{2k}.    (4.9)

The sequence of function values is then defined by

  f(x_0) = f(x_1) = 0, f(x_{2k+2}) = m_{2k+1}(x_{2k+1} + s_{2k+1}) = f(x_{2k+3}) (k ≥ 0),    (4.10)

where the second part guarantees the very successful nature of iteration 2k+1. We observe that, for k ≥ 0, f(x_{2k}) − f(x_{2k+1}) = 0

since iteration 2k is unsuccessful, and that

  f(x_{2k+1}) − f(x_{2k+2}) = Δm_{2k+1} = ((p−1)/p) g_{2k+1}^{(1+β)/β},    (4.11)

yielding, for every k ≥ 0, that

  f(x_0) − f(x_{2k+2}) = Σ_{j=0}^k [ f(x_{2j+1}) − f(x_{2j+2}) ] = ((p−1)/p) Σ_{j=0}^k g_{2j+1}^{(1+β)/β} = ((p−1)/p) Σ_{j=0}^k (1/(j+q))^{1+τ(1+β)/β}.

Hence the sequence {f(x_k)} is bounded below by

  f_* := −((p−1)/p) Σ_{j=1}^∞ j^{−(1+τ(1+β)/β)} = −((p−1)/p) ζ(1 + τ(1+β)/β) > −∞,    (4.12)

where ζ(·) is the Riemann zeta function. We conclude the definition of the sequences involved in our example by selecting σ_{2k} in order to impose that, for all k ≥ 0,

  s_{2k} = s_{2k+1} + (1/2) s_{2k+3},    (4.13)

where the factor 1/2 is chosen as when defining q above. Using (4.8), this is equivalent to asking that

  (g_{2k}/σ_{2k})^{1/(p−1)} = s_{2k+1} + (1/2) s_{2k+3},

which, in view of (4.5), is equivalent to requiring that

  σ_{2k} = g_{2k} / ( s_{2k+1} + (1/2) s_{2k+3} )^{p−1}.    (4.14)

If we now take (4.6), (4.13) and (4.4) into account, this amounts to imposing that

  σ_{2k+1}/σ_{2k} = ω_{2k}^{p−1} ∈ [ (5/4)^{p−1}, (3/2)^{p−1} ],

therefore satisfying (2.11) at unsuccessful iterations for a choice of γ_2 ≤ (5/4)^{p−1} and γ_3 ≥ (3/2)^{p−1}. In order to start the recursion, we (arbitrarily) define σ_{−1} by (4.5) with k = −1 and g_{−1} = (1/(q−1))^{β/(1+β)+τ}. We also observe that, for large enough k,

  σ_{2k+2}/σ_{2k+1} = (g_{2k+3}/g_{2k+1})^{1−(p−1)/β} [ s_{2k+3}/(s_{2k+3} + (1/2) s_{2k+5}) ]^{p−1} ∈ [γ_1, 1]

for a choice of γ_1 ≤ (2/3)^{p−1}, and (2.11) therefore also holds at (very) successful iterations. As a consequence of this somewhat lengthy description, we may therefore deduce that the sequences {x_k}, {g_k}, {σ_k} and {f(x_k)}

may be generated by Algorithm 2.1, provided only that iteration 2k is indeed unsuccessful, that is if

  f(x_{2k}) − f(x_{2k} + s_{2k}) < η_1 Δm_{2k},

where f(x_{2k} + s_{2k}) is the still undefined value of our putative objective function at x_{2k} + s_{2k} = x_{2k+3} + (1/2) s_{2k+3}. This condition is obviously satisfied if we also impose that

  f( x_{2k+3} + (1/2) s_{2k+3} ) = f^{2k}_{2k+3}, where f^{2k}_{2k+3} := max[ f(x_{2k+3}), f(x_{2k}) − 0.99 η_1 Δm_{2k}, f(x_{2k+4}) + (1/2) g_{2k+4} s_{2k+3} ].    (4.15)

Note that this last condition ensures that

  f(x_{2k+2}) = f(x_{2k+3}) ≤ f^{2k}_{2k+3}    (4.16)

and also, since f(x_{2k}) = f(x_{2k+1}) > f(x_{2k+3}), that

  f^{2k}_{2k+3} − f(x_{2k+4}) ≥ (1/2) g_{2k+4} s_{2k+3} and f^{2k}_{2k+3} ≤ max[ f(x_{2k+3}), f(x_{2k+1}) ].    (4.17)

We now turn to the definition of the objective function f(x), which must interpolate function and gradient values at the iterates. We start by noting that, for arbitrary a > 0 and s > 0, function values f_a and f_b and gradient values g_a and g_b, it is possible to construct a function

  f_{as}(t) = f_a + g_a t + c_{as} [ sin(φ_{as} t) ]^{1+β}    (4.18)

on the interval [a, a+s], where the parameters c_{as} and φ_{as} ∈ (0, π] can be determined to ensure that f_{as}(0) = f_a, g_{as}(0) = g_a, f_{as}(s) = f_b and g_{as}(s) = g_b. Indeed, since

  g_{as}(t) = g_a + c_{as} (1+β) φ_{as} [ sin(φ_{as} t) ]^β cos(φ_{as} t),    (4.19)

we deduce that

  g_b − g_a = c_{as} (1+β) φ_{as} [ sin(φ_{as} s) ]^β cos(φ_{as} s),    (4.20)
  f_b − f_a = g_a s + (g_b − g_a) sin(φ_{as} s) / [ (1+β) φ_{as} cos(φ_{as} s) ],

and hence conclude that φ_{as} s is the smallest positive root θ_{as} of the nonlinear equation

  sin(θ)/θ = ν_{as} cos(θ), where ν_{as} = (1+β) ( f_b − f_a − g_a s ) / ( (g_b − g_a) s ).    (4.21)

It is easy to check that such a root always exists in (0, π/2] if ν_{as} > 1. Given φ_{as}, or, equivalently, θ_{as} = φ_{as} s, we also obtain that

  c_{as} = ( f_b − f_a − g_a s ) / [ sin(θ_{as}) ]^{1+β}.

We now use this interpolation technique on each of the sequence of intervals specified in Table 4.2. Observe that the function is interpolated for every successful step in two pieces, with an intermediate point corresponding (for all iterations beyond the first) to the penultimate

unsuccessful trial point, where condition (4.15) is imposed as well as a zero gradient. We also choose (arbitrarily) f^{−2}_1 = −0.99 (3η_1/2) (g_1^{1+β}/σ_1)^{1/(p−1)}, corresponding to a fictitious unsuccessful iteration of index k = −2 with g_{−2} = g_{−1} and σ_{−2} = σ_{−1}/(1 + 2^{p−1}).

  Iteration k | Interpolation interval [a, a+s]            | f_a                  | g_a        | f_b              | g_b
  1           | [x_1, x_1 + (1/2)s_1]                       | f(x_0) = f(x_1)      | g_1        | f^{−2}_1         | 0
  1           | [x_1 + (1/2)s_1, x_2]                       | f^{−2}_1             | 0          | f(x_2)           | g_2
  3           | [x_3, x_3 + (1/2)s_3]                       | f(x_2) = f(x_3)      | g_3        | f^{0}_3          | 0
  3           | [x_3 + (1/2)s_3, x_4]                       | f^{0}_3              | 0          | f(x_4)           | g_4
  5           | [x_5, x_5 + (1/2)s_5]                       | f(x_4) = f(x_5)      | g_5        | f^{2}_5          | 0
  5           | [x_5 + (1/2)s_5, x_6]                       | f^{2}_5              | 0          | f(x_6)           | g_6
  2k+1        | [x_{2k+1}, x_{2k+1} + (1/2)s_{2k+1}]        | f(x_{2k}) = f(x_{2k+1}) | g_{2k+1} | f^{2k−2}_{2k+1}  | 0
  2k+1        | [x_{2k+1} + (1/2)s_{2k+1}, x_{2k+2}]        | f^{2k−2}_{2k+1}      | 0          | f(x_{2k+2})      | g_{2k+2}

Table 4.2: Interpolation conditions for successful iterations.

For the function (4.18) and its gradient (4.19) to be well-defined, we still need that ν_{as} > 1 for each interpolation interval. Consider the first such interval at iteration 2k+1 (k ≥ 0) and ν^1_{2k+1}, the value of ν_{as} corresponding to that interval. Using (4.16), we obtain that

  ν^1_{2k+1} = (1+β) [ f^{2k−2}_{2k+1} − f(x_{2k+1}) + (1/2) g_{2k+1} s_{2k+1} ] / [ (1/2) g_{2k+1} s_{2k+1} ] ≥ 1+β > 1,    (4.22)

as desired. For the second interpolation interval at iteration 2k+1, we have that

  ν^2_{2k+1} = (1+β) [ f^{2k−2}_{2k+1} − f(x_{2k+2}) ] / [ (1/2) g_{2k+2} s_{2k+1} ] ≥ 1+β > 1,    (4.23)

where we have used (4.15) to derive the inequality. We therefore obtain from (4.22) and (4.23) that, for all k ≥ 0, the desired roots θ^1_{2k+1} and θ^2_{2k+1} exist and satisfy

  θ^1_{2k+1} ≤ π/2 and θ^2_{2k+1} ≤ π/2.    (4.24)

As a consequence, sin(φ^i_{2k+1} t) is positive on each interpolation interval (i = 1, 2), and our interpolating function and its gradient are well-defined on each interval. Moreover, since both ν^1_{2k+1} and ν^2_{2k+1} are bounded below by 1+β, we obtain that there is a constant κ_θ > 0 such that

  θ^1_{2k+1} ∈ [κ_θ, π/2] and θ^2_{2k+1} ∈ [κ_θ, π/2],    (4.25)

and thus that there exists a constant κ_sin > 0, independent of k, such that

  sin(θ^1_{2k+1}) ≥ κ_sin and sin(θ^2_{2k+1}) ≥ κ_sin.    (4.26)

Figure 4.1 shows the shape of the resulting function and Figure 4.2 the shape of its gradient, whose construction implies that AS.1 holds. Figure 4.1 also shows the shape of the models m_{2k}(x_{2k} + s) on the intervals [x_{2k}, x_{2k} + s_{2k}] = [x_{2k}, x_{2k+3} + (1/2)s_{2k+3}] (dashed lines), illustrating that the model is a bad predictor of the objective function value at the point x_{2k} + s_{2k}, causing the unsuccessful nature of iteration 2k. Note that f(x) may be extended smoothly into a decreasing function for x < 0.

Figure 4.1: The shape of f(x) for the first 8 successful iterations and the shape of the model at each unsuccessful iteration (dashed), for β = 0.45, p = 2, τ = 0.0001, η_1 = 0.6 and q = 3.

As can be checked in these figures, f(x) is nonconvex and continuously differentiable. The form (4.19) implies that g(x) varies very quickly at the beginning of each interpolation interval, which is visible in Figure 4.2. We now investigate the properties of our interpolant further, and observe that, because of (4.11), (4.15), (4.17), the fact that f(x_{2k}) = f(x_{2k+1}) and the inequality g_{2k+2} s_{2k+1} ≤ g_{2k} s_{2k},

  f^{2k−2}_{2k+1} − f(x_{2k+1}) ≤ max[ f(x_{2k}) − f(x_{2k+1}), f(x_{2k+2}) + (1/2) g_{2k+2} s_{2k+1} − f(x_{2k+1}) ] < max[ Δm_{2k}, (1/2) g_{2k+2} s_{2k+1} ] < max[ g_{2k} s_{2k}, g_{2k+2} s_{2k+1} ]

Figure 4.2: The shape of g(x) for the first 8 successful iterations, for β = 0.45, p = 2, τ = 0.0001, η_1 = 0.6 and q = 3.

and hence that

  ν^1_{2k+1} = (1+β) [ f^{2k−2}_{2k+1} − f(x_{2k+1}) + (1/2) g_{2k+1} s_{2k+1} ] / [ (1/2) g_{2k+1} s_{2k+1} ] ≤ 2(1+β) [ g_{2k} s_{2k} + (1/2) g_{2k+1} s_{2k+1} ] / [ g_{2k+1} s_{2k+1} ] < 2(1+β) [ 2 g_{2k} s_{2k} / (g_{2k+1} s_{2k+1}) ] = 4(1+β) (g_{2k}/g_{2k+1})^{(1+β)/β} ≤ 4(1+β) (3/2)^{(1+β)/β},    (4.27)

where we used (4.2). Similarly, using (4.15), (4.10), (4.17), (4.11) and (4.1) in succession, we

obtain that

  ν^2_{2k+1} = (1+β) [ f^{2k−2}_{2k+1} − f(x_{2k+2}) ] / [ (1/2) g_{2k+2} s_{2k+1} ] ≤ 2(1+β) max[ f(x_{2k}) − f(x_{2k+2}), (1/2) g_{2k+2} s_{2k+1} ] / [ g_{2k+2} s_{2k+1} ] ≤ 2(1+β) max[ (Δm_{2k} + Δm_{2k+1}) / (g_{2k+2} s_{2k+1}), 1/2 ] = 2(1+β) max[ 2 ((p−1)/p) (g_{2k} s_{2k} + g_{2k+1} s_{2k+1}) / (g_{2k+2} s_{2k+1}), 1/2 ],    (4.28)

which remains bounded for all k ≥ 0 because the ratios g_{2k}/g_{2k+2}, s_{2k}/s_{2k+1} and g_{2k+1}/g_{2k+2} are all bounded by virtue of (4.1), (4.6) and (4.13). We may therefore deduce from (4.27) and (4.28) that there exists a constant κ_ν > 0, independent of k, such that, for all k ≥ 0,

  ν^1_{2k+1} ≤ κ_ν and ν^2_{2k+1} ≤ κ_ν.

As a consequence, and since the nonlinear equation in (4.21) can be written in the form tan(θ) = ν_{as} θ, we obtain that θ_{as} is uniformly bounded away from π/2 and hence that there exists a constant κ_cos > 0 such that

  cos(θ_{as}) = cos(φ_{as} s) ≥ κ_cos    (4.29)

for every interpolation interval.

Consider now 0 ≤ t_1 < t_2 ≤ s for a given interpolation interval [a, a+s]. Because of (4.24), we then have that

  |g_{as}(t_2) − g_{as}(t_1)| = c_{as}(1+β)φ_{as} | sin^β(φ_{as}t_2) cos(φ_{as}t_2) − sin^β(φ_{as}t_1) cos(φ_{as}t_1) |
    ≤ c_{as}(1+β)φ_{as} [ sin^β(φ_{as}t_2) |cos(φ_{as}t_2) − cos(φ_{as}t_1)| + |sin^β(φ_{as}t_2) − sin^β(φ_{as}t_1)| cos(φ_{as}t_1) ]
    ≤ c_{as}(1+β)φ_{as} [ |cos(φ_{as}t_2) − cos(φ_{as}t_1)| + |sin(φ_{as}t_2) − sin(φ_{as}t_1)|^β ].

Now, using the mean-value theorem,

  |cos(φ_{as}t_2) − cos(φ_{as}t_1)| = |sin(ξ)| φ_{as}|t_2 − t_1| ≤ (π/2)^{1−β} ( φ_{as}|t_2 − t_1| )^β,    (4.30)

where ξ ∈ (φ_{as}t_1, φ_{as}t_2) and where we have used the fact that

  φ_{as}|t_2 − t_1| = (π/2) (2φ_{as}|t_2 − t_1|/π) ≤ (π/2) (2φ_{as}|t_2 − t_1|/π)^β

because φ_{as}|t_2 − t_1| ≤ φ_{as}s ≤ π/2. Moreover, using the inequality

  |u^β − v^β| ≤ |u − v|^β for all u, v ∈ [0, 1],    (4.31)

and the fact that

  sin( (φ_{as}/2)(t_2 − t_1) ) < (φ_{as}/2)(t_2 − t_1)

(since φ_{as}(t_2 − t_1) ≤ φ_{as}s ≤ π/2), we deduce that

  |sin(φ_{as}t_2) − sin(φ_{as}t_1)|^β ≤ | 2 cos( (φ_{as}/2)(t_2 + t_1) ) sin( (φ_{as}/2)(t_2 − t_1) ) |^β ≤ | 2 sin( (φ_{as}/2)(t_2 − t_1) ) |^β < ( φ_{as}|t_2 − t_1| )^β.

Thus, combining this inequality with (4.30), we obtain that

  |g_{as}(t_2) − g_{as}(t_1)| ≤ ( 1 + (π/2)^{1−β} ) (1+β) c_{as} φ_{as}^{1+β} |t_2 − t_1|^β.    (4.32)

But we know from (4.6) that, for all k ≥ 0, g_{2k+1} = s_{2k+1}^β and g_{2k+2} = g_{2k+3} = s_{2k+3}^β ≤ s_{2k+1}^β = g_{2k+1}. As a consequence, we deduce, using Table 4.2, that, for every interpolation interval,

  |g_b − g_a| ≤ (2s)^β,

because the length s of each interval is equal to half that of the corresponding successful step. Using this inequality and (4.20), we obtain that

  c_{as} φ_{as}^{1+β} = φ_{as}^β |g_b − g_a| / [ (1+β) sin^β(θ_{as}) cos(θ_{as}) ] ≤ 2^β (φ_{as}s)^β / [ (1+β) sin^β(θ_{as}) cos(θ_{as}) ] ≤ π^2 2^β / [ 2(1+β) κ_sin^β κ_cos ],    (4.33)

where we used the equality φ_{as}s = θ_{as}, (4.24), (4.26), and (4.29) to derive the last inequality. Hence we deduce from (4.32) that, for x and y belonging to the same interpolation interval,

  |g(x) − g(y)| ≤ ( 1 + (π/2)^{1−β} ) π^2 2^{β−1} κ_sin^{−β} κ_cos^{−1} |x − y|^β =: (L/2) |x − y|^β.    (4.34)

Consider now 0 ≤ x < y where x and y belong to different interpolation intervals, and assume first that y belongs to the interpolation interval following that containing x. Then, if z ∈ (x, y) is the junction point between the two successive intervals,

  |g(x) − g(y)| ≤ |g(x) − g(z)| + |g(z) − g(y)| ≤ (L/2) |x − z|^β + (L/2) |z − y|^β ≤ L |x − y|^β,    (4.35)

where we use the triangle inequality, (4.34) on each interval, and the fact that u^β + v^β ≤ 2(u + v)^β for all u, v ≥ 0.

Consider finally 0 ≤ x < y where x and y belong to different interpolation intervals, where y does not belong to the interval following that containing x. Let us denote by r_x the smallest root of g larger than x and by r_y the largest root smaller than y. Note that the existence of these roots is guaranteed by the construction of the interpolating function f, which ensures that a stationary point occurs at the junction between the two interpolation intervals covering a single successful step. It is easy to verify that x and r_x must belong either to the same interpolation interval or to two successive intervals. The same is true of r_y and y, yielding that

  |g(x)| ≤ L (r_x − x)^β and |g(y)| ≤ L (y − r_y)^β.    (4.36)

Moreover, using either (4.34) or (4.35), we have that |g(x) − g(r_x)| ≤ L |x − r_x|^β and |g(r_y) − g(y)| ≤ L |r_y − y|^β, and we may deduce, using (4.36) and (4.31), that

  |g(x) − g(y)| ≤ |g(x) − g(r_x)| + |g(r_y) − g(y)| ≤ L [ (r_x − x)^β + (y − r_y)^β ] ≤ 2L ( r_x − x + y − r_y )^β ≤ 2L (y − x)^β.    (4.37)

It then results from (4.34), (4.35) and (4.37) that g(x) is Hölder continuous and AS.2 is satisfied in our example. This is illustrated in Figure 4.3.

We also note that, because of (4.25), the definition of θ_{as}, the fact that s ≤ 1, (4.6) and the decreasing nature of {|g_k|}, we have that, for every interpolation interval,

  φ_{as} = θ_{as}/s ≥ κ_θ/s ≥ κ_θ/g_a^{1/β} ≥ κ_θ/g_0^{1/β}.

Hence (4.19) and (4.33) ensure that |g(x)| is bounded above for x ≥ 0, which, together with the inequalities f(x_k) ≥ f_* > −∞, s_k ≤ 1 and the mean-value theorem applied in each interval, guarantees that there exists a constant f_low > −∞ such that f(x) ≥ f_low for all x ≥ 0. Thus AS.3 holds with f_target = −∞ and f_* = f_low. Moreover, AS.4 trivially follows with κ_gl = 0, κ_gu given by the upper bound on |g(x)| just derived, and ε_* = ε. AS.5 is satisfied by construction with κ_B = 0, since we set B_k = 0 for all k ≥ 0. We therefore conclude that all our assumptions hold and that our example is valid, in that Algorithm 2.1 applied to f(x), with arbitrarily small τ ∈ (0, 1) and in the case where p > 1 + β, needs at least

  2 ( ε^{−(1+β)/(β+τ(1+β))} − q )

Figure 4.3: The shape of the function |g(x) − g(y)| / |x − y|^β for the interval spanned by the first 8 successful iterations, for β = 0.45, p = 2, τ = 0.0001, η_1 = 0.6 and q = 3.

iterations (and function evaluations) to obtain an iterate x_ε such that |g(x_ε)| ≤ ε. Since q is independent of ε, this shows that the complexity bound stated by part (ii) of Theorem 3.10 is essentially sharp.

5 Discussion

Which power of ε_* < 1 dominates in the complexity bounds of Theorem 3.10 is illustrated in Figure 5.4 as a function of p and β. It is interesting to note that the worst-case evaluation complexity of our general class of regularized methods does depend on the relative values of p and β. Observe also that, when ε_* < 1, ε_*^{−p/(p−1)} ≥ ε_*^{−(1+β)/β} in the triangle for which p ≤ 1 + β and p ≤ 2. As can be seen in this figure, there is little incentive for a user to choose a regularization power p < 2, at least from the worst-case complexity point of view (not to mention the need of AS.4). It is also interesting to observe that, if p ≥ 2, the complexity no longer depends on the precise value of p, but only depends on the smoothness of the objective function as measured by the Hölder exponent β (whose knowledge is not required a priori). In that sense, the algorithm adapts itself to the problem at hand without any further user tuning (see also the universal gradient methods by Nesterov for the convex case [14]).

If ε_* ≥ 1 (that is if either ε ≥ 1 or κ_gl ≥ 1), the results above simplify because negative powers of ε_* are bounded above by one. As a consequence, all terms involving such powers (which we kept explicit in the analysis for ε_* < 1) are absorbed in the constants, and the complexity bounds of Theorem 3.10 essentially reduce to multiples of the difference f(x_0) − f_*. Note also that Lemma 3.1 allows us to equate β > 1 with β = 1 and κ_gl = ‖g(x_0)‖. In this case, either ε_* = ε > ‖g(x_0)‖ and Algorithm 2.1 stops at iteration 0, or ε_* = ‖g(x_0)‖ and the bounds of Theorem 3.10 become independent of ε, resulting in a bound on the number of iterations and evaluations directly proportional to f(x_0) − f_target, as expected.
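The case split just described (and drawn in Figure 5.4) is simple enough to be captured in a few lines; the helper below is only an illustration of the exponents appearing in Theorem 3.10 and Lemma 3.1, with hypothetical inputs of our own choosing.

```python
def complexity_exponent(p: float, beta: float) -> float:
    """Exponent e such that the evaluation bound of Theorem 3.10 is O(eps_*^(-e)).

    Follows the case split of the theorem (and Figure 5.4): for beta > 1 the
    bound is O(1) (Lemma 3.1), for p >= 1 + beta it is O(eps_*^(-(1+beta)/beta)),
    and for 1 < p < 1 + beta it is O(eps_*^(-p/(p-1))).
    """
    if beta > 1.0:
        return 0.0                      # O(1): f is linear by Lemma 3.1
    if p >= 1.0 + beta:
        return (1.0 + beta) / beta      # part (ii) of Theorem 3.10
    return p / (p - 1.0)                # part (i) of Theorem 3.10

# The two classical cases recovered by the theory (the thick dots in Figure 5.4):
assert complexity_exponent(2.0, 1.0) == 2.0   # p = 2, beta = 1: O(eps^-2), as in [3, 6]
assert complexity_exponent(3.0, 1.0) == 2.0   # p = 3, beta = 1: O(eps^-2) for the first-order variant, cf. [5]
```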

Figure 5.4: Worst-case evaluation complexity as a function of p and β in the case where ε_* < 1. [The figure partitions the (β, p) plane into the region β > 1, where the bound is O(1), the region β ≤ 1 with p ≥ β + 1, where it is O(ε_*^{−(1+β)/β}), and the region β ≤ 1 with 1 < p < β + 1, where it is O(ε_*^{−p/(p−1)}), the last two being separated by the line p = β + 1.]

We conclude by observing that the theory presented above recovers known results (see [5] for the case where p = 3 and β = 1, and [3, 6] for the case where p = 2 and β = 1); these cases correspond to the thick dots in Figure 5.4.

References

[1] Alain Bensoussan and Jens Frehse. Regularity Results for Nonlinear Elliptic Systems and Applications. Springer Verlag, Heidelberg, Berlin, New York, 2002.

[2] C. Cartis, N. I. M. Gould, and Ph. L. Toint. Trust-region and other regularisations of linear least-squares problems. BIT, 49(1):21–53, 2009.

[3] C. Cartis, N. I. M. Gould, and Ph. L. Toint. On the complexity of steepest descent, Newton's and regularized Newton's methods for nonconvex unconstrained optimization. SIAM Journal on Optimization, 20(6):2833–2852, 2010.

[4] C. Cartis, N. I. M. Gould, and Ph. L. Toint. Adaptive cubic overestimation methods for unconstrained optimization. Part I: motivation, convergence and numerical results. Mathematical Programming, Series A, 127(2):245–295, 2011.

[5] C. Cartis, N. I. M. Gould, and Ph. L. Toint. Adaptive cubic overestimation methods for unconstrained optimization. Part II: worst-case function-evaluation complexity. Mathematical Programming, Series A, 130(2):295–319, 2011.

[6] C. Cartis, N. I. M. Gould, and Ph. L. Toint. On the evaluation complexity of composite function minimization with applications to nonconvex nonlinear programming. SIAM Journal on Optimization, 21(4):1721–1739, 2011.

[7] C. Cartis, N. I. M. Gould, and Ph. L. Toint. On the evaluation complexity of cubic regularization methods for potentially rank-deficient nonlinear least-squares problems and its relevance to constrained nonlinear optimization. SIAM Journal on Optimization, 23(3):1553–1574, 2013.

[8] N. I. M. Gould, D. P. Robinson, and H. S. Thorne. On solving trust-region and other regularised subproblems in optimization. Mathematical Programming, Series C, 2(1):21–57, 2010.

[9] G. N. Grapiglia, J. Yuan, and Y. Yuan. Global convergence and worst-case complexity of a derivative-free trust-region algorithm for composite nonsmooth optimization. Technical report, University of Paraná, Curitiba, Brazil.

[10] G. N. Grapiglia, J. Yuan, and Y. Yuan. On the convergence and worst-case complexity of trust-region and regularization methods for unconstrained optimization. Mathematical Programming, Series A (to appear), 2014.

[11] S. Gratton, A. Sartenaer, and Ph. L. Toint. Recursive trust-region methods for multiscale nonlinear optimization. SIAM Journal on Optimization, 19(1):414–444, 2008.

[12] Yu. Nesterov. Introductory Lectures on Convex Optimization. Applied Optimization. Kluwer Academic Publishers, Dordrecht, The Netherlands, 2004.

[13] Yu. Nesterov. Gradient methods for minimizing composite objective functions. Mathematical Programming, Series A, 140(1):125–161, 2013.

[14] Yu. Nesterov. Universal gradient methods for convex optimization problems. Technical Report DP 2013/26, CORE, Catholic University of Louvain, Louvain-la-Neuve, Belgium, 2013.

[15] Yu. Nesterov and B. T. Polyak. Cubic regularization of Newton method and its global performance. Mathematical Programming, Series A, 108(1):177–205, 2006.

[16] Gas Processors and Suppliers Association. Engineering Data Book, Vol. 2. GPSA, Tulsa, USA.

[17] K. Ueda. A Regularized Newton Method without Line Search for Unconstrained Optimization. PhD thesis, Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Kyoto, Japan.

[18] K. Ueda and N. Yamashita. Convergence properties of the regularized Newton method for the unconstrained nonconvex optimization. Applied Mathematics & Optimization, 62(1):27–46, 2010.

[19] K. Ueda and N. Yamashita. On a global complexity bound of the Levenberg-Marquardt method. Journal of Optimization Theory and Applications, 147(3):443–453, 2010.

[20] L. N. Vicente. Worst case complexity of direct search. EURO Journal on Computational Optimization, 1:143–153, 2013.


More information

Capital Budgeting: The Valuation of Unusual, Irregular, or Extraordinary Cash Flows

Capital Budgeting: The Valuation of Unusual, Irregular, or Extraordinary Cash Flows Caital Budgeting: The Valuation of Unusual, Irregular, or Extraordinary Cash Flows ichael C. Ehrhardt Philli R. Daves Finance Deartment, SC 424 University of Tennessee Knoxville, TN 37996-0540 423-974-1717

More information

A Stochastic Levenberg-Marquardt Method Using Random Models with Application to Data Assimilation

A Stochastic Levenberg-Marquardt Method Using Random Models with Application to Data Assimilation A Stochastic Levenberg-Marquardt Method Using Random Models with Application to Data Assimilation E Bergou Y Diouane V Kungurtsev C W Royer July 5, 08 Abstract Globally convergent variants of the Gauss-Newton

More information

1 Overview. 2 The Gradient Descent Algorithm. AM 221: Advanced Optimization Spring 2016

1 Overview. 2 The Gradient Descent Algorithm. AM 221: Advanced Optimization Spring 2016 AM 22: Advanced Optimization Spring 206 Prof. Yaron Singer Lecture 9 February 24th Overview In the previous lecture we reviewed results from multivariate calculus in preparation for our journey into convex

More information

Lecture 5: Performance Analysis (part 1)

Lecture 5: Performance Analysis (part 1) Lecture 5: Performance Analysis (art 1) 1 Tyical Time Measurements Dark grey: time sent on comutation, decreasing with # of rocessors White: time sent on communication, increasing with # of rocessors Oerations

More information

The Correlation Smile Recovery

The Correlation Smile Recovery Fortis Bank Equity & Credit Derivatives Quantitative Research The Correlation Smile Recovery E. Vandenbrande, A. Vandendorpe, Y. Nesterov, P. Van Dooren draft version : March 2, 2009 1 Introduction Pricing

More information

Online Robustness Appendix to Are Household Surveys Like Tax Forms: Evidence from the Self Employed

Online Robustness Appendix to Are Household Surveys Like Tax Forms: Evidence from the Self Employed Online Robustness Aendix to Are Household Surveys Like Tax Forms: Evidence from the Self Emloyed October 01 Erik Hurst University of Chicago Geng Li Board of Governors of the Federal Reserve System Benjamin

More information

Objectives. 3.3 Toward statistical inference

Objectives. 3.3 Toward statistical inference Objectives 3.3 Toward statistical inference Poulation versus samle (CIS, Chater 6) Toward statistical inference Samling variability Further reading: htt://onlinestatbook.com/2/estimation/characteristics.html

More information

Non-Inferiority Tests for the Ratio of Two Correlated Proportions

Non-Inferiority Tests for the Ratio of Two Correlated Proportions Chater 161 Non-Inferiority Tests for the Ratio of Two Correlated Proortions Introduction This module comutes ower and samle size for non-inferiority tests of the ratio in which two dichotomous resonses

More information

Nonlinear programming without a penalty function or a filter

Nonlinear programming without a penalty function or a filter Nonlinear programming without a penalty function or a filter N I M Gould Ph L Toint October 1, 2007 RAL-TR-2007-016 c Science and Technology Facilities Council Enquires about copyright, reproduction and

More information

What can we do with numerical optimization?

What can we do with numerical optimization? Optimization motivation and background Eddie Wadbro Introduction to PDE Constrained Optimization, 2016 February 15 16, 2016 Eddie Wadbro, Introduction to PDE Constrained Optimization, February 15 16, 2016

More information

Nonlinear programming without a penalty function or a filter

Nonlinear programming without a penalty function or a filter Math. Program., Ser. A (2010) 122:155 196 DOI 10.1007/s10107-008-0244-7 FULL LENGTH PAPER Nonlinear programming without a penalty function or a filter N. I. M. Gould Ph.L.Toint Received: 11 December 2007

More information

CS522 - Exotic and Path-Dependent Options

CS522 - Exotic and Path-Dependent Options CS522 - Exotic and Path-Deendent Otions Tibor Jánosi May 5, 2005 0. Other Otion Tyes We have studied extensively Euroean and American uts and calls. The class of otions is much larger, however. A digital

More information

Matching Markets and Social Networks

Matching Markets and Social Networks Matching Markets and Social Networks Tilman Klum Emory University Mary Schroeder University of Iowa Setember 0 Abstract We consider a satial two-sided matching market with a network friction, where exchange

More information

ON THE MEAN VALUE OF THE SCBF FUNCTION

ON THE MEAN VALUE OF THE SCBF FUNCTION ON THE MEAN VALUE OF THE SCBF FUNCTION Zhang Xiaobeng Deartment of Mathematics, Northwest University Xi an, Shaani, P.R.China Abstract Keywords: The main urose of this aer is using the elementary method

More information

Quantitative Aggregate Effects of Asymmetric Information

Quantitative Aggregate Effects of Asymmetric Information Quantitative Aggregate Effects of Asymmetric Information Pablo Kurlat February 2012 In this note I roose a calibration of the model in Kurlat (forthcoming) to try to assess the otential magnitude of the

More information

University of Edinburgh, Edinburgh EH9 3JZ, United Kingdom.

University of Edinburgh, Edinburgh EH9 3JZ, United Kingdom. An adaptive cubic regularization algorithm for nonconvex optimization with convex constraints and its function-evaluation complexity by C. Cartis 1, N. I. M. Gould 2 and Ph. L. Toint 3 February 20, 2009;

More information

A Comparative Study of Various Loss Functions in the Economic Tolerance Design

A Comparative Study of Various Loss Functions in the Economic Tolerance Design A Comarative Study of Various Loss Functions in the Economic Tolerance Design Jeh-Nan Pan Deartment of Statistics National Chen-Kung University, Tainan, Taiwan 700, ROC Jianbiao Pan Deartment of Industrial

More information

A TRAJECTORIAL INTERPRETATION OF DOOB S MARTINGALE INEQUALITIES

A TRAJECTORIAL INTERPRETATION OF DOOB S MARTINGALE INEQUALITIES A RAJECORIAL INERPREAION OF DOOB S MARINGALE INEQUALIIES B. ACCIAIO, M. BEIGLBÖCK, F. PENKNER, W. SCHACHERMAYER, AND J. EMME Abstract. We resent a unified aroach to Doob s L maximal inequalities for 1

More information

Monetary policy is a controversial

Monetary policy is a controversial Inflation Persistence: How Much Can We Exlain? PAU RABANAL AND JUAN F. RUBIO-RAMÍREZ Rabanal is an economist in the monetary and financial systems deartment at the International Monetary Fund in Washington,

More information

Effects of Size and Allocation Method on Stock Portfolio Performance: A Simulation Study

Effects of Size and Allocation Method on Stock Portfolio Performance: A Simulation Study 2011 3rd International Conference on Information and Financial Engineering IPEDR vol.12 (2011) (2011) IACSIT Press, Singaore Effects of Size and Allocation Method on Stock Portfolio Performance: A Simulation

More information

Lecture 7: Bayesian approach to MAB - Gittins index

Lecture 7: Bayesian approach to MAB - Gittins index Advanced Topics in Machine Learning and Algorithmic Game Theory Lecture 7: Bayesian approach to MAB - Gittins index Lecturer: Yishay Mansour Scribe: Mariano Schain 7.1 Introduction In the Bayesian approach

More information

Asian Economic and Financial Review A MODEL FOR ESTIMATING THE DISTRIBUTION OF FUTURE POPULATION. Ben David Nissim.

Asian Economic and Financial Review A MODEL FOR ESTIMATING THE DISTRIBUTION OF FUTURE POPULATION. Ben David Nissim. Asian Economic and Financial Review journal homeage: htt://www.aessweb.com/journals/5 A MODEL FOR ESTIMATING THE DISTRIBUTION OF FUTURE POPULATION Ben David Nissim Deartment of Economics and Management,

More information

Buyer-Optimal Learning and Monopoly Pricing

Buyer-Optimal Learning and Monopoly Pricing Buyer-Otimal Learning and Monooly Pricing Anne-Katrin Roesler and Balázs Szentes January 2, 217 Abstract This aer analyzes a bilateral trade model where the buyer s valuation for the object is uncertain

More information

Exercise List: Proving convergence of the (Stochastic) Gradient Descent Method for the Least Squares Problem.

Exercise List: Proving convergence of the (Stochastic) Gradient Descent Method for the Least Squares Problem. Exercise List: Proving convergence of the (Stochastic) Gradient Descent Method for the Least Squares Problem. Robert M. Gower. October 3, 07 Introduction This is an exercise in proving the convergence

More information

A Trust Region Algorithm for Heterogeneous Multiobjective Optimization

A Trust Region Algorithm for Heterogeneous Multiobjective Optimization A Trust Region Algorithm for Heterogeneous Multiobjective Optimization Jana Thomann and Gabriele Eichfelder 8.0.018 Abstract This paper presents a new trust region method for multiobjective heterogeneous

More information

A GENERALISED PRICE-SCORING MODEL FOR TENDER EVALUATION

A GENERALISED PRICE-SCORING MODEL FOR TENDER EVALUATION 019-026 rice scoring 9/20/05 12:12 PM Page 19 A GENERALISED PRICE-SCORING MODEL FOR TENDER EVALUATION Thum Peng Chew BE (Hons), M Eng Sc, FIEM, P. Eng, MIEEE ABSTRACT This aer rooses a generalised rice-scoring

More information

Lecture 17: More on Markov Decision Processes. Reinforcement learning

Lecture 17: More on Markov Decision Processes. Reinforcement learning Lecture 17: More on Markov Decision Processes. Reinforcement learning Learning a model: maximum likelihood Learning a value function directly Monte Carlo Temporal-difference (TD) learning COMP-424, Lecture

More information

A Multi-Objective Approach to Portfolio Optimization

A Multi-Objective Approach to Portfolio Optimization RoseHulman Undergraduate Mathematics Journal Volume 8 Issue Article 2 A MultiObjective Aroach to Portfolio Otimization Yaoyao Clare Duan Boston College, sweetclare@gmail.com Follow this and additional

More information

B. Online Appendix. where ɛ may be arbitrarily chosen to satisfy 0 < ɛ < s 1 and s 1 is defined in (B1). This can be rewritten as

B. Online Appendix. where ɛ may be arbitrarily chosen to satisfy 0 < ɛ < s 1 and s 1 is defined in (B1). This can be rewritten as B Online Appendix B1 Constructing examples with nonmonotonic adoption policies Assume c > 0 and the utility function u(w) is increasing and approaches as w approaches 0 Suppose we have a prior distribution

More information

C (1,1) (1,2) (2,1) (2,2)

C (1,1) (1,2) (2,1) (2,2) TWO COIN MORRA This game is layed by two layers, R and C. Each layer hides either one or two silver dollars in his/her hand. Simultaneously, each layer guesses how many coins the other layer is holding.

More information

Midterm Exam: Tuesday 28 March in class Sample exam problems ( Homework 5 ) available tomorrow at the latest

Midterm Exam: Tuesday 28 March in class Sample exam problems ( Homework 5 ) available tomorrow at the latest Plan Martingales 1. Basic Definitions 2. Examles 3. Overview of Results Reading: G&S Section 12.1-12.4 Next Time: More Martingales Midterm Exam: Tuesday 28 March in class Samle exam roblems ( Homework

More information

A class of coherent risk measures based on one-sided moments

A class of coherent risk measures based on one-sided moments A class of coherent risk measures based on one-sided moments T. Fischer Darmstadt University of Technology November 11, 2003 Abstract This brief paper explains how to obtain upper boundaries of shortfall

More information

Statistics and Probability Letters. Variance stabilizing transformations of Poisson, binomial and negative binomial distributions

Statistics and Probability Letters. Variance stabilizing transformations of Poisson, binomial and negative binomial distributions Statistics and Probability Letters 79 (9) 6 69 Contents lists available at ScienceDirect Statistics and Probability Letters journal homeage: www.elsevier.com/locate/staro Variance stabilizing transformations

More information

Ordering a deck of cards... Lecture 3: Binomial Distribution. Example. Permutations & Combinations

Ordering a deck of cards... Lecture 3: Binomial Distribution. Example. Permutations & Combinations Ordering a dec of cards... Lecture 3: Binomial Distribution Sta 111 Colin Rundel May 16, 2014 If you have ever shuffled a dec of cards you have done something no one else has ever done before or will ever

More information

( ) ( ) β. max. subject to. ( ) β. x S

( ) ( ) β. max. subject to. ( ) β. x S Intermediate Microeconomic Theory: ECON 5: Alication of Consumer Theory Constrained Maimization In the last set of notes, and based on our earlier discussion, we said that we can characterize individual

More information

***SECTION 7.1*** Discrete and Continuous Random Variables

***SECTION 7.1*** Discrete and Continuous Random Variables ***SECTION 7.*** Discrete and Continuous Random Variables Samle saces need not consist of numbers; tossing coins yields H s and T s. However, in statistics we are most often interested in numerical outcomes

More information

Chapter 7 One-Dimensional Search Methods

Chapter 7 One-Dimensional Search Methods Chapter 7 One-Dimensional Search Methods An Introduction to Optimization Spring, 2014 1 Wei-Ta Chu Golden Section Search! Determine the minimizer of a function over a closed interval, say. The only assumption

More information

Objectives. 5.2, 8.1 Inference for a single proportion. Categorical data from a simple random sample. Binomial distribution

Objectives. 5.2, 8.1 Inference for a single proportion. Categorical data from a simple random sample. Binomial distribution Objectives 5.2, 8.1 Inference for a single roortion Categorical data from a simle random samle Binomial distribution Samling distribution of the samle roortion Significance test for a single roortion Large-samle

More information

Lecture 2. Main Topics: (Part II) Chapter 2 (2-7), Chapter 3. Bayes Theorem: Let A, B be two events, then. The probabilities P ( B), probability of B.

Lecture 2. Main Topics: (Part II) Chapter 2 (2-7), Chapter 3. Bayes Theorem: Let A, B be two events, then. The probabilities P ( B), probability of B. STT315, Section 701, Summer 006 Lecture (Part II) Main Toics: Chater (-7), Chater 3. Bayes Theorem: Let A, B be two events, then B A) = A B) B) A B) B) + A B) B) The robabilities P ( B), B) are called

More information

Homework #5 7 th week Math 240 Thursday October 24, 2013

Homework #5 7 th week Math 240 Thursday October 24, 2013 . Let a, b > be integers and g : = gcd(a, b) its greatest common divisor. Show that if a = g q a and b = g q b then q a and q b are relatively rime. Since gcd(κ a, κ b) = κ gcd(a, b) in articular, for

More information

CHOICE THEORY, UTILITY FUNCTIONS AND RISK AVERSION

CHOICE THEORY, UTILITY FUNCTIONS AND RISK AVERSION CHOICE THEORY, UTILITY FUNCTIONS AND RISK AVERSION Szabolcs Sebestyén szabolcs.sebestyen@iscte.pt Master in Finance INVESTMENTS Sebestyén (ISCTE-IUL) Choice Theory Investments 1 / 65 Outline 1 An Introduction

More information

Sublinear Time Algorithms Oct 19, Lecture 1

Sublinear Time Algorithms Oct 19, Lecture 1 0368.416701 Sublinear Time Algorithms Oct 19, 2009 Lecturer: Ronitt Rubinfeld Lecture 1 Scribe: Daniel Shahaf 1 Sublinear-time algorithms: motivation Twenty years ago, there was practically no investigation

More information

Quality Regulation without Regulating Quality

Quality Regulation without Regulating Quality 1 Quality Regulation without Regulating Quality Claudia Kriehn, ifo Institute for Economic Research, Germany March 2004 Abstract Against the background that a combination of rice-ca and minimum uality

More information

A NOTE ON SKEW-NORMAL DISTRIBUTION APPROXIMATION TO THE NEGATIVE BINOMAL DISTRIBUTION

A NOTE ON SKEW-NORMAL DISTRIBUTION APPROXIMATION TO THE NEGATIVE BINOMAL DISTRIBUTION A NOTE ON SKEW-NORMAL DISTRIBUTION APPROXIMATION TO THE NEGATIVE BINOMAL DISTRIBUTION JYH-JIUAN LIN 1, CHING-HUI CHANG * AND ROSEMARY JOU 1 Deartment of Statistics Tamkang University 151 Ying-Chuan Road,

More information

: now we have a family of utility functions for wealth increments z indexed by initial wealth w.

: now we have a family of utility functions for wealth increments z indexed by initial wealth w. Lotteries with Money Payoffs, continued Fix u, let w denote wealth, and set u ( z) u( z w) : now we have a family of utility functions for wealth increments z indexed by initial wealth w. (a) Recall from

More information

CS 3331 Numerical Methods Lecture 2: Functions of One Variable. Cherung Lee

CS 3331 Numerical Methods Lecture 2: Functions of One Variable. Cherung Lee CS 3331 Numerical Methods Lecture 2: Functions of One Variable Cherung Lee Outline Introduction Solving nonlinear equations: find x such that f(x ) = 0. Binary search methods: (Bisection, regula falsi)

More information

Chapter 3: Black-Scholes Equation and Its Numerical Evaluation

Chapter 3: Black-Scholes Equation and Its Numerical Evaluation Chapter 3: Black-Scholes Equation and Its Numerical Evaluation 3.1 Itô Integral 3.1.1 Convergence in the Mean and Stieltjes Integral Definition 3.1 (Convergence in the Mean) A sequence {X n } n ln of random

More information

TESTING THE CAPITAL ASSET PRICING MODEL AFTER CURRENCY REFORM: THE CASE OF ZIMBABWE STOCK EXCHANGE

TESTING THE CAPITAL ASSET PRICING MODEL AFTER CURRENCY REFORM: THE CASE OF ZIMBABWE STOCK EXCHANGE TESTING THE CAPITAL ASSET PRICING MODEL AFTER CURRENCY REFORM: THE CASE OF ZIMBABWE STOCK EXCHANGE Batsirai Winmore Mazviona 1 ABSTRACT The Caital Asset Pricing Model (CAPM) endeavors to exlain the relationshi

More information

1 < = α σ +σ < 0. Using the parameters and h = 1/365 this is N ( ) = If we use h = 1/252, the value would be N ( ) =

1 < = α σ +σ < 0. Using the parameters and h = 1/365 this is N ( ) = If we use h = 1/252, the value would be N ( ) = Chater 6 Value at Risk Question 6.1 Since the rice of stock A in h years (S h ) is lognormal, 1 < = α σ +σ < 0 ( ) P Sh S0 P h hz σ α σ α = P Z < h = N h. σ σ (1) () Using the arameters and h = 1/365 this

More information

On the Power of Structural Violations in Priority Queues

On the Power of Structural Violations in Priority Queues On the Power of Structural Violations in Priority Queues Amr Elmasry 1 Claus Jensen 2 Jyrki Katajainen 2 1 Deartment of Comuter Engineering and Systems, Alexandria University Alexandria, Egyt 2 Deartment

More information

Inventory Systems with Stochastic Demand and Supply: Properties and Approximations

Inventory Systems with Stochastic Demand and Supply: Properties and Approximations Working Paer, Forthcoming in the Euroean Journal of Oerational Research Inventory Systems with Stochastic Demand and Suly: Proerties and Aroximations Amanda J. Schmitt Center for Transortation and Logistics

More information

The Impact of Flexibility And Capacity Allocation On The Performance of Primary Care Practices

The Impact of Flexibility And Capacity Allocation On The Performance of Primary Care Practices University of Massachusetts Amherst ScholarWorks@UMass Amherst Masters Theses 1911 - February 2014 2010 The Imact of Flexibility And Caacity Allocation On The Performance of Primary Care Practices Liang

More information

Richardson Extrapolation Techniques for the Pricing of American-style Options

Richardson Extrapolation Techniques for the Pricing of American-style Options Richardson Extrapolation Techniques for the Pricing of American-style Options June 1, 2005 Abstract Richardson Extrapolation Techniques for the Pricing of American-style Options In this paper we re-examine

More information

Dynamic Stability of the Nash Equilibrium for a Bidding Game

Dynamic Stability of the Nash Equilibrium for a Bidding Game Dynamic Stability of the Nash Equilibrium for a Bidding Game Alberto Bressan and Hongxu Wei Deartment of Mathematics, Penn State University, University Park, Pa 16802, USA e-mails: bressan@mathsuedu, xiaoyitangwei@gmailcom

More information

On the Lower Arbitrage Bound of American Contingent Claims

On the Lower Arbitrage Bound of American Contingent Claims On the Lower Arbitrage Bound of American Contingent Claims Beatrice Acciaio Gregor Svindland December 2011 Abstract We prove that in a discrete-time market model the lower arbitrage bound of an American

More information

Martingale Pricing Theory in Discrete-Time and Discrete-Space Models

Martingale Pricing Theory in Discrete-Time and Discrete-Space Models IEOR E4707: Foundations of Financial Engineering c 206 by Martin Haugh Martingale Pricing Theory in Discrete-Time and Discrete-Space Models These notes develop the theory of martingale pricing in a discrete-time,

More information

Publication Efficiency at DSI FEM CULS An Application of the Data Envelopment Analysis

Publication Efficiency at DSI FEM CULS An Application of the Data Envelopment Analysis Publication Efficiency at DSI FEM CULS An Alication of the Data Enveloment Analysis Martin Flégl, Helena Brožová 1 Abstract. The education and research efficiency at universities has always been very imortant

More information

Methods and Models of Loss Reserving Based on Run Off Triangles: A Unifying Survey

Methods and Models of Loss Reserving Based on Run Off Triangles: A Unifying Survey Methods and Models of Loss Reserving Based on Run Off Triangles: A Unifying Survey By Klaus D Schmidt Lehrstuhl für Versicherungsmathematik Technische Universität Dresden Abstract The present paper provides

More information

Physical and Financial Virtual Power Plants

Physical and Financial Virtual Power Plants Physical and Financial Virtual Power Plants by Bert WILLEMS Public Economics Center for Economic Studies Discussions Paer Series (DPS) 05.1 htt://www.econ.kuleuven.be/ces/discussionaers/default.htm Aril

More information

ON JARQUE-BERA TESTS FOR ASSESSING MULTIVARIATE NORMALITY

ON JARQUE-BERA TESTS FOR ASSESSING MULTIVARIATE NORMALITY Journal of Statistics: Advances in Theory and Alications Volume, umber, 009, Pages 07-0 O JARQUE-BERA TESTS FOR ASSESSIG MULTIVARIATE ORMALITY KAZUYUKI KOIZUMI, AOYA OKAMOTO and TAKASHI SEO Deartment of

More information

A Semi-parametric Test for Drift Speci cation in the Di usion Model

A Semi-parametric Test for Drift Speci cation in the Di usion Model A Semi-arametric est for Drift Seci cation in the Di usion Model Lin hu Indiana University Aril 3, 29 Abstract In this aer, we roose a misseci cation test for the drift coe cient in a semi-arametric di

More information

2/20/2013. of Manchester. The University COMP Building a yes / no classifier

2/20/2013. of Manchester. The University COMP Building a yes / no classifier COMP4 Lecture 6 Building a yes / no classifier Buildinga feature-basedclassifier Whatis a classifier? What is an information feature? Building a classifier from one feature Probability densities and the

More information

Individual Comparative Advantage and Human Capital Investment under Uncertainty

Individual Comparative Advantage and Human Capital Investment under Uncertainty Individual Comarative Advantage and Human Caital Investment under Uncertainty Toshihiro Ichida Waseda University July 3, 0 Abstract Secialization and the division of labor are the sources of high roductivity

More information

Management Accounting of Production Overheads by Groups of Equipment

Management Accounting of Production Overheads by Groups of Equipment Asian Social Science; Vol. 11, No. 11; 2015 ISSN 1911-2017 E-ISSN 1911-2025 Published by Canadian Center of Science and Education Management Accounting of Production verheads by Grous of Equiment Sokolov

More information

Professor Huihua NIE, PhD School of Economics, Renmin University of China HOLD-UP, PROPERTY RIGHTS AND REPUTATION

Professor Huihua NIE, PhD School of Economics, Renmin University of China   HOLD-UP, PROPERTY RIGHTS AND REPUTATION Professor uihua NIE, PhD School of Economics, Renmin University of China E-mail: niehuihua@gmail.com OD-UP, PROPERTY RIGTS AND REPUTATION Abstract: By introducing asymmetric information of investors abilities

More information

25 Increasing and Decreasing Functions

25 Increasing and Decreasing Functions - 25 Increasing and Decreasing Functions It is useful in mathematics to define whether a function is increasing or decreasing. In this section we will use the differential of a function to determine this

More information

Economic Performance, Wealth Distribution and Credit Restrictions under variable investment: The open economy

Economic Performance, Wealth Distribution and Credit Restrictions under variable investment: The open economy Economic Performance, Wealth Distribution and Credit Restrictions under variable investment: The oen economy Ronald Fischer U. de Chile Diego Huerta Banco Central de Chile August 21, 2015 Abstract Potential

More information