Accelerated Stochastic Gradient Descent Praneeth Netrapalli MSR India
Transcription
1 Accelerated Stochastic Gradient Descent
Praneeth Netrapalli, MSR India
Presented at the OSL workshop, Les Houches, France.
Joint work with Prateek Jain, Sham M. Kakade, Rahul Kidambi and Aaron Sidford.
2 Linear regression
$\min_x f(x) = \|Ax - b\|_2^2$, where $x \in \mathbb{R}^d$, $A \in \mathbb{R}^{n \times d}$, $b \in \mathbb{R}^n$
A basic problem that arises pervasively in applications
Deeply studied in the literature
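3 Gradient descent for linear regression
$x_{t+1} = x_t - \delta \cdot A^\top (A x_t - b)$, where $A^\top (A x_t - b)$ is the gradient (constant factor absorbed into $\delta$)
Convergence rate: $O\left(\kappa \log \frac{f(x_0) - f^*}{\epsilon}\right)$
$f^* = \min_x f(x)$; $\epsilon$ = target suboptimality
Condition number: $\kappa = \frac{\sigma_{\max}(A^\top A)}{\sigma_{\min}(A^\top A)}$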
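
The update on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the talk: the step size $\delta = 1/\sigma_{\max}(A^\top A)$, the random test problem, and the function names are all assumptions made for the example.

```python
import numpy as np

def gradient_descent(A, b, steps):
    """Minimize f(x) = ||Ax - b||_2^2 with x_{t+1} = x_t - delta * A^T (A x_t - b)."""
    x = np.zeros(A.shape[1])
    # Assumed step size: 1 / sigma_max(A^T A); eigvalsh returns ascending eigenvalues.
    delta = 1.0 / np.linalg.eigvalsh(A.T @ A)[-1]
    for _ in range(steps):
        x = x - delta * (A.T @ (A @ x - b))
    return x

def condition_number(A):
    """kappa = sigma_max(A^T A) / sigma_min(A^T A)."""
    eig = np.linalg.eigvalsh(A.T @ A)
    return eig[-1] / eig[0]

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))   # hypothetical well-conditioned problem
b = rng.standard_normal(200)
x_gd = gradient_descent(A, b, steps=500)
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]   # reference least-squares solution
print("kappa:", condition_number(A))
print("distance to least-squares solution:", np.linalg.norm(x_gd - x_ls))
```

On a well-conditioned problem like this one the iterates reach the least-squares solution quickly; the number of steps needed grows with $\kappa$.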
4 Question: Is it possible to do better?
Hope: GD does not reuse past gradients
Gradient descent: $x_{t+1} = x_t - \delta \nabla f(x_t)$
Answer: Yes!
Conjugate gradient (Hestenes and Stiefel 1952)
Heavy ball method (Polyak 1964)
Accelerated gradient descent (Nemirovsky and Yudin 1977, Nesterov 1983)
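5 Accelerated gradient descent (AGD)
$x_{t+1} = y_t - \delta \nabla f(y_t)$
$y_{t+1} = x_{t+1} + \gamma (x_{t+1} - x_t)$
Convergence rate: $O\left(\sqrt{\kappa} \log \frac{f(x_0) - f^*}{\epsilon}\right)$
$f^* = \min_x f(x)$; $\epsilon$ = target suboptimality
Condition number: $\kappa = \frac{\sigma_{\max}(A^\top A)}{\sigma_{\min}(A^\top A)}$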
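6 Accelerated gradient descent (AGD)
Convergence rate: $O\left(\sqrt{\kappa} \log \frac{f(x_0) - f^*}{\epsilon}\right)$
Compared to: $O\left(\kappa \log \frac{f(x_0) - f^*}{\epsilon}\right)$ for GD
$f^* = \min_x f(x)$; $\epsilon$ = target suboptimality
Condition number: $\kappa = \frac{\sigma_{\max}(A^\top A)}{\sigma_{\min}(A^\top A)}$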
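
To make the gap between the two rates concrete, here is a small sketch that runs GD and AGD side by side on an ill-conditioned least-squares problem. The momentum value $\gamma = (\sqrt{\kappa}-1)/(\sqrt{\kappa}+1)$ and the step size $\delta = 1/\sigma_{\max}(A^\top A)$ are standard textbook choices assumed for illustration; the slides do not specify them, and the diagonal test matrix is hypothetical.

```python
import numpy as np

def compare_gd_agd(A, b, steps):
    """Run GD and AGD on f(x) = ||Ax - b||_2^2 from x = 0; return the final f values."""
    eig = np.linalg.eigvalsh(A.T @ A)
    kappa = eig[-1] / eig[0]
    delta = 1.0 / eig[-1]                                      # assumed step size
    gamma = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)    # assumed momentum

    def grad(x):
        return A.T @ (A @ x - b)

    def f(x):
        return float(np.sum((A @ x - b) ** 2))

    d = A.shape[1]
    x_gd = np.zeros(d)
    x, y = np.zeros(d), np.zeros(d)
    for _ in range(steps):
        x_gd = x_gd - delta * grad(x_gd)        # plain gradient descent
        x_next = y - delta * grad(y)            # AGD: gradient step from y_t
        y = x_next + gamma * (x_next - x)       # AGD: momentum extrapolation
        x = x_next
    return f(x_gd), f(x), kappa

eigs = np.logspace(-4, 0, 50)          # spectrum of A^T A, so kappa is about 1e4
A = np.diag(np.sqrt(eigs))
b = A @ np.ones(50)                    # consistent system, so f* = 0
f_gd, f_agd, kappa = compare_gd_agd(A, b, steps=2000)
print("kappa:", kappa, "f after GD:", f_gd, "f after AGD:", f_agd)
```

With $\kappa \approx 10^4$ and 2000 steps, GD has barely moved along the smallest-eigenvalue direction, while AGD has had many multiples of $\sqrt{\kappa} = 100$ steps to converge.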
7 Source:
8 Stochastic approximation (Robbins and Monro 1951)
Distribution $D$ on $\mathbb{R}^d \times \mathbb{R}$
$f(x) \triangleq \mathbb{E}_{(a,b) \sim D}\left[(a^\top x - b)^2\right]$
Equivalent to $\|Ax - b\|_2^2$ where $A$ has infinitely many rows
Observe $n$ pairs $(a_1, b_1), \ldots, (a_n, b_n)$
Interested in the entire distribution $D$ rather than the data points, as in ML
Fit a linear model to the distribution
Cannot compute exact gradients
9 Stochastic gradient descent (SGD) (Robbins and Monro 1951)
$x_{t+1} = x_t - \delta \cdot \hat{\nabla} f(x_t)$, where $\mathbb{E}[\hat{\nabla} f(x_t)] = \nabla f(x_t)$
Return $\frac{1}{n} \sum_i x_i$ (Polyak and Juditsky 1992)
Is gradient descent in expectation
For linear regression, SGD: $x_{t+1} = x_t - \delta \cdot (a_t^\top x_t - b_t)\, a_t$
Streaming algorithm: extremely efficient and widely used in practice
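
A minimal sketch of streaming SGD with iterate averaging for linear regression follows. The Gaussian data model, the noise level, the step size $\delta = 1/(2(d+2))$, and the names `sample` and `averaged_sgd` are assumptions chosen for the example, not values from the talk.

```python
import numpy as np

def averaged_sgd(sample, d, n, delta):
    """One pass of SGD over n streamed pairs (a, b); returns the average iterate."""
    x = np.zeros(d)
    x_sum = np.zeros(d)
    for _ in range(n):
        a, b = sample()
        x = x - delta * (a @ x - b) * a      # x_{t+1} = x_t - delta (a_t^T x_t - b_t) a_t
        x_sum += x
    return x_sum / n                          # Polyak-Juditsky averaging

rng = np.random.default_rng(0)
d = 5
x_star = np.ones(d)                           # hypothetical ground truth

def sample():
    """Draw one pair with a ~ N(0, I) and b = a^T x* + Gaussian noise (std 0.1)."""
    a = rng.standard_normal(d)
    return a, a @ x_star + 0.1 * rng.standard_normal()

x_bar = averaged_sgd(sample, d, n=20000, delta=1.0 / (2 * (d + 2)))
print("error of averaged iterate:", np.linalg.norm(x_bar - x_star))
```

Each sample is touched once and then discarded, which is what makes the method a streaming algorithm.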
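10 Best possible rate
Consider $b = a^\top x^* + \text{noise}$; noise $\sim N(0, \sigma^2)$
Recall: $f(x) \triangleq \mathbb{E}_{(a,b) \sim D}\left[(a^\top x - b)^2\right]$
$\hat{x} \triangleq \arg\min_x \sum_{i=1}^n (a_i^\top x - b_i)^2$
$\mathbb{E}[f(\hat{x})] - f(x^*) = (1 + o(1)) \frac{\sigma^2 d}{n}$ (van der Vaart, 2000)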
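11 Best possible rate
In general: $x^* \triangleq \arg\min_x f(x)$
$\mathbb{E}\left[(a^\top x^* - b)^2\, a a^\top\right] \preceq \sigma^2\, \mathbb{E}[a a^\top]$
$\hat{x} \triangleq \arg\min_x \sum_{i=1}^n (a_i^\top x - b_i)^2$
$\mathbb{E}[f(\hat{x})] - f(x^*) \le (1 + o(1)) \frac{\sigma^2 d}{n}$ (van der Vaart, 2000)
Equivalently, $n \le (1 + o(1)) \frac{\sigma^2 d}{\epsilon}$ samples suffice for suboptimality $\epsilon$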
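
The $\sigma^2 d / n$ rate can be checked numerically. The sketch below assumes a well-specified model with $a \sim N(0, I_d)$ (so that $f(x) - f(x^*) = \|x - x^*\|_2^2$) and averages the excess risk of the least-squares estimator over repeated trials; the sizes and the trial count are arbitrary choices for illustration.

```python
import numpy as np

def erm_excess_risk(n, d, sigma, rng):
    """Excess risk f(x_hat) - f(x*) of least squares when a ~ N(0, I_d)."""
    x_star = rng.standard_normal(d)
    A = rng.standard_normal((n, d))
    b = A @ x_star + sigma * rng.standard_normal(n)
    x_hat = np.linalg.lstsq(A, b, rcond=None)[0]
    # With E[a a^T] = I, the excess risk equals the squared Euclidean error.
    return float(np.sum((x_hat - x_star) ** 2))

rng = np.random.default_rng(0)
n, d, sigma = 500, 10, 1.0
mean_risk = np.mean([erm_excess_risk(n, d, sigma, rng) for _ in range(200)])
ratio = mean_risk / (sigma ** 2 * d / n)
print("average excess risk / (sigma^2 d / n):", ratio)
```

The printed ratio should be close to 1, in line with the $(1 + o(1))$ factor on the slide.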
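12 Convergence rate of SGD
Convergence rate: $O\left(\kappa \log \frac{f(x_0) - f^*}{\epsilon} + \frac{\sigma^2 d}{\epsilon}\right)$ (Jain et al. 2016)
$f^* = \min_x f(x)$; $\epsilon$ = target suboptimality
Condition number: $\kappa \triangleq \frac{\max \|a\|_2^2}{\sigma_{\min}(\mathbb{E}[a a^\top])}$
Noise level: $\mathbb{E}\left[(a^\top x^* - b)^2\, a a^\top\right] \preceq \sigma^2\, \mathbb{E}[a a^\top]$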
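13 Recap
Deterministic case: GD $O\left(\kappa \log \frac{f(x_0) - f^*}{\epsilon}\right)$; AGD $O\left(\sqrt{\kappa} \log \frac{f(x_0) - f^*}{\epsilon}\right)$
Stochastic approximation: SGD $O\left(\kappa \log \frac{f(x_0) - f^*}{\epsilon} + \frac{\sigma^2 d}{\epsilon}\right)$; accelerated SGD: unknown
Question: Is accelerating SGD possible?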
14 Is this really important?
Extremely important in practice
As we saw, acceleration can give orders-of-magnitude improvement
Neural network training uses Nesterov's AGD as well as Adam, but there is no theoretical understanding
Jain et al. show that acceleration leads to more parallelizability
Existing results show AGD is not robust to deterministic noise (d'Aspremont 2008, Devolder et al. 2014) but is robust to random additive noise (Ghadimi and Lan 2010, Dieuleveut et al. 2016)
Stochastic approximation falls between the above two cases
Key issue: it mixes optimization and statistics (i.e., # iterations = # samples)
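15 Is acceleration possible?
$b = a^\top x^*$
Noise level: $\sigma^2 = 0$
SGD convergence rate: $O\left(\kappa \log \frac{f(x_0) - f^*}{\epsilon}\right)$
Accelerated rate: $O\left(\sqrt{\kappa} \log \frac{f(x_0) - f^*}{\epsilon}\right)$?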
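16 Example I: Discrete distribution
$a = e_i$ (the $i$-th standard basis vector) with probability $p_i$
In this case, $\kappa \triangleq \frac{\max \|a\|_2^2}{\sigma_{\min}(\mathbb{E}[a a^\top])} = \frac{1}{p_{\min}}$
Is $O\left(\sqrt{\kappa} \log \frac{f(x_0) - f^*}{\epsilon}\right)$ possible?
Or: can we halve the error using $O(\sqrt{\kappa})$ samples?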
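17 Example I: Discrete distribution
$a = e_i$ (the $i$-th standard basis vector) with probability $p_i$; $\kappa = \frac{1}{p_{\min}}$
Fewer than $\kappa$ samples do not observe the $p_{\min}$ direction
$\Rightarrow \sum_i a_i a_i^\top$ is not invertible
$\Rightarrow$ cannot do better than $O(\kappa)$
Acceleration not possible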
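
The obstruction in the discrete example is easy to see numerically. The sketch below assumes the samples are standard basis vectors drawn with probabilities $p_i$; the particular `p`, the sample size, and the function name are hypothetical.

```python
import numpy as np

def discrete_example(p, n, rng):
    """Draw n samples a = e_i with probability p_i.

    Returns kappa, the rank of sum_i a_i a_i^T, and the number of distinct
    directions observed.
    """
    d = len(p)
    idx = rng.choice(d, size=n, p=p)
    S = np.zeros((d, d))
    for i in idx:
        S[i, i] += 1.0                    # a_i a_i^T = e_i e_i^T
    kappa = 1.0 / np.min(p)               # max ||a||^2 = 1 and sigma_min(E[a a^T]) = p_min
    return kappa, int(np.linalg.matrix_rank(S)), len(np.unique(idx))

rng = np.random.default_rng(0)
p = np.array([0.4995, 0.4995, 0.001])
n = 100                                    # far fewer than kappa = 1000 samples
kappa, rank, n_seen = discrete_example(p, n, rng)
miss_probability = (1.0 - np.min(p)) ** n  # chance the rare direction is never observed
print("kappa:", kappa, "rank:", rank, "directions seen:", n_seen)
print("probability of missing the rare direction:", miss_probability)
```

The rank of the empirical second-moment matrix equals the number of directions seen, so until roughly $1/p_{\min}$ samples arrive the rare direction is usually missing and the system cannot be solved.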
18 Example II: Gaussian
$a \sim N(0, H)$, where $H$ is a PSD matrix
In this case, $\kappa \approx \frac{\mathrm{Tr}(H)}{\sigma_{\min}(H)} \ge d$
However, after $O(d)$ samples: $\frac{1}{n} \sum_i a_i a_i^\top \approx H$
Possible to solve $a_i^\top x^* = b_i$ after $O(d)$ samples
Acceleration might be possible
19 Discrete vs Gaussian
[Figure panels: Discrete distribution; Gaussian distribution]
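20 Key issue: matrix spectral concentration
Recall: $a_i \sim D$. Let $H \triangleq \mathbb{E}[a_i a_i^\top]$.
For $\hat{x} \triangleq \arg\min_x \sum_{i=1}^n (a_i^\top x - b_i)^2$ to be good, we need:
$(1 - \delta) H \preceq \frac{1}{n} \sum_{i=1}^n a_i a_i^\top \preceq (1 + \delta) H$
How many samples are required for spectral concentration?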
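
The sandwich condition on this slide can be tested directly: it holds exactly when every eigenvalue of $H^{-1/2} \left(\frac{1}{n}\sum_i a_i a_i^\top\right) H^{-1/2}$ lies in $[1-\delta, 1+\delta]$. The sketch below computes the smallest such $\delta$ for Gaussian samples; the matrix `H` and the sample sizes are assumptions for illustration.

```python
import numpy as np

def concentration_delta(samples, H):
    """Smallest delta with (1-delta) H <= (1/n) sum a_i a_i^T <= (1+delta) H in PSD order."""
    evals, evecs = np.linalg.eigh(H)
    H_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    H_hat = samples.T @ samples / samples.shape[0]
    # Whitened empirical second moment; its eigenvalues must lie in [1-delta, 1+delta].
    w = np.linalg.eigvalsh(H_inv_sqrt @ H_hat @ H_inv_sqrt)
    return float(max(1.0 - w[0], w[-1] - 1.0))

rng = np.random.default_rng(0)
H = np.diag([1.0, 0.5, 0.1, 0.05, 0.01])       # hypothetical ill-conditioned covariance
L = np.linalg.cholesky(H)
deltas = {}
for n in (10, 100, 20000):
    samples = rng.standard_normal((n, 5)) @ L.T   # rows are draws a ~ N(0, H)
    deltas[n] = concentration_delta(samples, H)
    print(n, deltas[n])
```

For Gaussian data the deviation shrinks once $n$ exceeds a small multiple of $d$, regardless of how ill-conditioned $H$ is.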
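21 Separating optimization and statistics
Matrix variance (Tropp 2012): $\left\| \mathbb{E}\left[\|a\|_2^2\, a a^\top\right] \right\|_2$
Recall $H \triangleq \mathbb{E}[a a^\top]$
Statistical condition number: $\tilde{\kappa} \triangleq \left\| \mathbb{E}\left[ \|H^{-1/2} a\|_2^2\, (H^{-1/2} a)(H^{-1/2} a)^\top \right] \right\|_2$
Matrix Bernstein Theorem (Tropp 2015): if $n > O(\tilde{\kappa})$, then $(1 - \delta) H \preceq \frac{1}{n} \sum_{i=1}^n a_i a_i^\top \preceq (1 + \delta) H$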
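22 Is acceleration possible?
$O(\tilde{\kappa})$ samples are sufficient
Recall SGD convergence rate: $O\left(\kappa \log \frac{f(x_0) - f^*}{\epsilon}\right)$
Always $\tilde{\kappa} \le \kappa$. Acceleration might be possible if $\tilde{\kappa} \ll \kappa$
Discrete case: $\tilde{\kappa} = \frac{1}{p_{\min}} = \kappa$
Gaussian case: $\tilde{\kappa} = O(d) \le \kappa$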
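
The two cases can be compared by evaluating the statistical condition number numerically. In this sketch the discrete case is computed exactly from its support, and the Gaussian case is a Monte Carlo estimate; the helper name, the probabilities, and the covariance are assumptions, and $\mathrm{Tr}(H)/\sigma_{\min}(H)$ is used as the stand-in for the Gaussian condition number.

```python
import numpy as np

def statistical_condition_number(vectors, weights, H):
    """|| E[ ||H^{-1/2} a||^2 (H^{-1/2} a)(H^{-1/2} a)^T ] ||_2 for a weighted point set."""
    evals, evecs = np.linalg.eigh(H)
    H_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = vectors @ H_inv_sqrt                     # rows are the whitened vectors H^{-1/2} a
    sq = np.sum(Z ** 2, axis=1)                  # ||H^{-1/2} a||_2^2 per row
    M = (Z * (weights * sq)[:, None]).T @ Z
    return float(np.linalg.eigvalsh(M)[-1])

# Discrete case: a = e_i with probability p_i, so H = diag(p).
p = np.array([0.5, 0.3, 0.2])
kt_discrete = statistical_condition_number(np.eye(3), p, np.diag(p))

# Gaussian case: Monte Carlo estimate with equal weights on N samples.
rng = np.random.default_rng(0)
d, N = 5, 100000
H = np.diag([1.0, 0.5, 0.1, 0.05, 0.01])
samples = rng.standard_normal((N, d)) @ np.linalg.cholesky(H).T
kt_gauss = statistical_condition_number(samples, np.full(N, 1.0 / N), H)
kappa_gauss = np.trace(H) / np.min(np.diag(H))   # Tr(H) / sigma_min(H)

print("discrete: kappa_tilde =", kt_discrete, " 1/p_min =", 1.0 / p.min())
print("gaussian: kappa_tilde ~", kt_gauss, " Tr(H)/sigma_min(H) =", kappa_gauss)
```

In the discrete case the two condition numbers coincide, while for the Gaussian the statistical condition number stays near $d + 2$ even though $\mathrm{Tr}(H)/\sigma_{\min}(H)$ is much larger.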
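23 Result
Convergence rate of ASGD: $O\left(\sqrt{\kappa \tilde{\kappa}} \log \frac{f(x_0) - f^*}{\epsilon} + \frac{\sigma^2 d}{\epsilon}\right)$
Compared to SGD: $O\left(\kappa \log \frac{f(x_0) - f^*}{\epsilon} + \frac{\sigma^2 d}{\epsilon}\right)$
Improvement since $\tilde{\kappa} \le \kappa$
Conjecture: lower bound $\Omega\left(\sqrt{\kappa \tilde{\kappa}} \log \frac{f(x_0) - f^*}{\epsilon}\right)$ (inspired by Woodworth and Srebro 2016)
Key takeaway: Acceleration possible! The gain depends on the statistical condition number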
24 Simulations: no noise
[Figure panels: Discrete distribution; Gaussian distribution]
25 Simulations: with noise
[Figure panels: Discrete distribution; Gaussian distribution]
26 High level challenges
Several versions of accelerated algorithms are known, e.g., conjugate gradient 1952, heavy ball 1964, momentum methods 1983, accelerated coordinate descent 2012, linear coupling 2014
Many of them are equivalent in the deterministic setting but not in the stochastic setting
Many different analyses exist even for momentum methods: Nesterov's analysis 1983, coordinate descent 2012, ODE analysis 2013, linear coupling 2014
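27 Algorithm
Nesterov 2012 (exact gradients). Parameters: $\alpha, \beta, \gamma, \delta$
1. $v_0 = x_0$
2. $y_{t-1} = \alpha x_{t-1} + (1 - \alpha) v_{t-1}$
3. $x_t = y_{t-1} - \delta \nabla f(y_{t-1})$
4. $z_{t-1} = \beta y_{t-1} + (1 - \beta) v_{t-1}$
5. $v_t = z_{t-1} - \gamma \nabla f(y_{t-1})$
Our algorithm (stochastic gradients). Parameters: $\alpha, \beta, \gamma, \delta$
1. $v_0 = x_0$
2. $y_{t-1} = \alpha x_{t-1} + (1 - \alpha) v_{t-1}$
3. $x_t = y_{t-1} - \delta \hat{\nabla}_t f(y_{t-1})$
4. $z_{t-1} = \beta y_{t-1} + (1 - \beta) v_{t-1}$
5. $v_t = z_{t-1} - \gamma \hat{\nabla}_t f(y_{t-1})$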
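
The five steps of the stochastic variant translate almost line by line into code. The sketch below runs them on noiseless streaming linear regression with $a \sim N(0, I)$. The slides do not give values for $\alpha, \beta, \gamma, \delta$; the settings used here are illustrative assumptions expressed through $\sqrt{\kappa \tilde{\kappa}}$, and the data model and names are hypothetical.

```python
import numpy as np

def asgd(sample, d, n, alpha, beta, gamma, delta):
    """Run the five-step accelerated SGD recursion on n streamed pairs (a, b)."""
    x = np.zeros(d)
    v = x.copy()                                   # 1. v_0 = x_0
    for _ in range(n):
        a, b = sample()
        y = alpha * x + (1.0 - alpha) * v          # 2. y_{t-1}
        g = (a @ y - b) * a                        # stochastic gradient at y_{t-1}
        x = y - delta * g                          # 3. x_t
        z = beta * y + (1.0 - beta) * v            # 4. z_{t-1}
        v = z - gamma * g                          # 5. v_t
    return x

rng = np.random.default_rng(0)
d = 5
x_star = np.ones(d)                                # hypothetical ground truth

def sample():
    """a ~ N(0, I), so H = I and sigma_min(H) = 1; noiseless labels b = a^T x*."""
    a = rng.standard_normal(d)
    return a, a @ x_star

# Assumed parameter choices (not from the slides).
R2 = d + 2.0                 # E[||a||^2 a a^T] = (d + 2) I for a ~ N(0, I)
mu = 1.0                     # sigma_min(H)
root = np.sqrt((R2 / mu) * (d + 2.0))   # plays the role of sqrt(kappa * kappa_tilde)
c = 3.0 * np.sqrt(5.0)
x_out = asgd(sample, d, n=3000,
             alpha=c * root / (1.0 + c * root),
             beta=1.0 / (9.0 * root),
             gamma=1.0 / (c * mu * root),
             delta=1.0 / (5.0 * R2))
print("final error:", np.linalg.norm(x_out - x_star))
```

Both the $x$ and $v$ updates reuse the same stochastic gradient, evaluated at the interpolated point $y_{t-1}$, so each step still consumes a single sample.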
28 Proof overview
Recall our guarantee: $O\left(\sqrt{\kappa \tilde{\kappa}} \log \frac{f(x_0) - f^*}{\epsilon} + \frac{\sigma^2 d}{\epsilon}\right)$
The first term depends on the initial error; the second is the statistical error
Different analyses for the two terms
For the first term, analyze assuming $\sigma = 0$
For the second term, analyze assuming $x_0 = x^*$
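29 Part I: Potential function
Iterates $x_t, v_t$ of ASGD. $H \triangleq \mathbb{E}[a a^\top]$.
Existing analyses use the potential function $\|x_t - x^*\|_H^2 + \sigma_{\min}(H) \|v_t - x^*\|_2^2$
We use $\|x_t - x^*\|_2^2 + \sigma_{\min}(H) \|v_t - x^*\|_{H^{-1}}^2$
We show $\|x_t - x^*\|_2^2 + \sigma_{\min}(H) \|v_t - x^*\|_{H^{-1}}^2 \le \left(1 - \frac{1}{\sqrt{\kappa \tilde{\kappa}}}\right) \left( \|x_{t-1} - x^*\|_2^2 + \sigma_{\min}(H) \|v_{t-1} - x^*\|_{H^{-1}}^2 \right)$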
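30 Part II: Stochastic process analysis
$\begin{bmatrix} x_{t+1} - x^* \\ y_{t+1} - x^* \end{bmatrix} = C \begin{bmatrix} x_t - x^* \\ y_t - x^* \end{bmatrix} + \text{noise}$
Let $\theta_t \triangleq \mathbb{E}\left[ \begin{bmatrix} x_t - x^* \\ y_t - x^* \end{bmatrix} \otimes \begin{bmatrix} x_t - x^* \\ y_t - x^* \end{bmatrix} \right]$
$\theta_{t+1} = B \theta_t + \text{noise} \otimes \text{noise}$
$\theta_n \to \sum_i B^i (\text{noise} \otimes \text{noise}) = (I - B)^{-1} (\text{noise} \otimes \text{noise})$
Our algorithm. Parameters: $\alpha, \beta, \gamma, \delta$
1. $v_0 = x_0$
2. $y_{t-1} = \alpha x_{t-1} + (1 - \alpha) v_{t-1}$
3. $x_t = y_{t-1} - \delta \hat{\nabla}_t f(y_{t-1})$
4. $z_{t-1} = \beta y_{t-1} + (1 - \beta) v_{t-1}$
5. $v_t = z_{t-1} - \gamma \hat{\nabla}_t f(y_{t-1})$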
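31 Part II: Stochastic process analysis
Need to understand $(I - B)^{-1} (\text{noise} \otimes \text{noise})$
$B$ has singular values $> 1$, but fortunately eigenvalues $< 1$
Solve the 1-dim version of $(I - B)^{-1} (\text{noise} \otimes \text{noise})$ via explicit computations
Combine the 1-dim bounds with (statistical) condition number bounds
Result: $(I - B)^{-1} (\text{noise} \otimes \text{noise})$ is bounded in the PSD order by a combination of terms in $\tilde{\kappa}$, $H^{-1}$ and $\delta I$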
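
The point about singular values versus eigenvalues can be illustrated on a toy matrix. The $2 \times 2$ matrix below is a hypothetical stand-in, not the operator $B$ from the talk: its largest singular value exceeds 1, so a single application can expand a vector, yet its eigenvalues are below 1, so the series $\sum_i B^i$ still converges to $(I - B)^{-1}$.

```python
import numpy as np

# Hypothetical non-normal matrix with eigenvalues 0.9 and a large off-diagonal entry.
B = np.array([[0.9, 1.0],
              [0.0, 0.9]])

top_singular = np.linalg.svd(B, compute_uv=False)[0]
spectral_radius = np.max(np.abs(np.linalg.eigvals(B)))

# Partial sums of the Neumann series sum_{i >= 0} B^i.
partial = np.zeros((2, 2))
power = np.eye(2)
for _ in range(2000):
    partial += power
    power = power @ B

limit = np.linalg.inv(np.eye(2) - B)
print("largest singular value:", top_singular)
print("spectral radius:", spectral_radius)
print("series sum:\n", partial)
print("(I - B)^{-1}:\n", limit)
```

Because norm-based contraction arguments fail when the top singular value is above 1, the stationary covariance has to be bounded through the eigenstructure instead, which is why the slide resorts to explicit one-dimensional computations.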
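32 Recap
Deterministic case: GD $O\left(\kappa \log \frac{f(x_0) - f^*}{\epsilon}\right)$; AGD $O\left(\sqrt{\kappa} \log \frac{f(x_0) - f^*}{\epsilon}\right)$
Stochastic approximation: SGD $O\left(\kappa \log \frac{f(x_0) - f^*}{\epsilon} + \frac{\sigma^2 d}{\epsilon}\right)$; ASGD $O\left(\sqrt{\kappa \tilde{\kappa}} \log \frac{f(x_0) - f^*}{\epsilon} + \frac{\sigma^2 d}{\epsilon}\right)$
Acceleration is possible; the gain depends on the statistical condition number
Techniques: new potential function, stochastic process analysis
Conjecture: Our result is tight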
33 Streaming optimization for ML
Streaming algorithms are very powerful for ML applications
SGD and variants are widely used in practice
Classical stochastic approximation focuses on asymptotic rates
Tools from optimization help obtain strong finite sample guarantees
These have implications for parallelization as well
34 Some examples
Linear regression
Finite sample guarantees: Moulines and Bach 2011, Defossez and Bach 2015
Parallelization: Jain et al.
Acceleration: This talk
Smooth convex functions
Finite sample guarantees: Bach and Moulines 2013
PCA: Oja's algorithm
Rank-1: Balsubramani et al. 2013, Jain et al.
Higher rank: Allen-Zhu and Li 2016
35 Open problems
Linear regression: Parameter-free algorithm, e.g., conjugate gradient
General convex functions: Acceleration, parallelization?
Non-convex functions: Streaming algorithms, acceleration, parallelization?
PCA: Tight finite sample guarantees?
Quasi-Newton methods
36 Thank you! Questions?
More informationBudget Management In GSP (2018)
Budget Management In GSP (2018) Yahoo! March 18, 2018 Miguel March 18, 2018 1 / 26 Today s Presentation: Budget Management Strategies in Repeated auctions, Balseiro, Kim, and Mahdian, WWW2017 Learning
More informationRichardson Extrapolation Techniques for the Pricing of American-style Options
Richardson Extrapolation Techniques for the Pricing of American-style Options June 1, 2005 Abstract Richardson Extrapolation Techniques for the Pricing of American-style Options In this paper we re-examine
More informationNumerical Methods in Option Pricing (Part III)
Numerical Methods in Option Pricing (Part III) E. Explicit Finite Differences. Use of the Forward, Central, and Symmetric Central a. In order to obtain an explicit solution for the price of the derivative,
More informationProject 1: Double Pendulum
Final Projects Introduction to Numerical Analysis II http://www.math.ucsb.edu/ atzberg/winter2009numericalanalysis/index.html Professor: Paul J. Atzberger Due: Friday, March 20th Turn in to TA s Mailbox:
More informationValuing volatility and variance swaps for a non-gaussian Ornstein-Uhlenbeck stochastic volatility model
Valuing volatility and variance swaps for a non-gaussian Ornstein-Uhlenbeck stochastic volatility model 1(23) Valuing volatility and variance swaps for a non-gaussian Ornstein-Uhlenbeck stochastic volatility
More information1 Explicit Euler Scheme (or Euler Forward Scheme )
Numerical methods for PDE in Finance - M2MO - Paris Diderot American options January 2017 Files: https://ljll.math.upmc.fr/bokanowski/enseignement/2016/m2mo/m2mo.html We look for a numerical approximation
More informationRandomness and Fractals
Randomness and Fractals Why do so many physicists become traders? Gregory F. Lawler Department of Mathematics Department of Statistics University of Chicago September 25, 2011 1 / 24 Mathematics and the
More informationRisk Management for Chemical Supply Chain Planning under Uncertainty
for Chemical Supply Chain Planning under Uncertainty Fengqi You and Ignacio E. Grossmann Dept. of Chemical Engineering, Carnegie Mellon University John M. Wassick The Dow Chemical Company Introduction
More informationStability in geometric & functional inequalities
Stability in geometric & functional inequalities A. Figalli The University of Texas at Austin www.ma.utexas.edu/users/figalli/ Alessio Figalli (UT Austin) Stability in geom. & funct. ineq. Krakow, July
More informationA simple wealth model
Quantitative Macroeconomics Raül Santaeulàlia-Llopis, MOVE-UAB and Barcelona GSE Homework 5, due Thu Nov 1 I A simple wealth model Consider the sequential problem of a household that maximizes over streams
More information