Machine Learning (CSE 446): Learning as Minimizing Loss

1 Machine Learning (CSE 446): Learning as Minimizing Loss
Noah Smith, © 2017 University of Washington, nasmith@cs.washington.edu
October 23, 2017

2 Sorry! No office hour for me today. Wednesday is as usual.

3 Perceptron
A model and an algorithm, rolled into one.
Model: $f(\mathbf{x}) = \operatorname{sign}(\mathbf{w} \cdot \mathbf{x} + b)$, known as linear, visualized by a (hopefully) separating hyperplane in feature-space.
Algorithm: PerceptronTrain, an error-driven, iterative updating algorithm.
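
As a rough illustration, the model and one common textbook form of the error-driven update can be sketched in Python. The slide does not spell out PerceptronTrain, so the update rule (add y·x to w and y to b on a mistake), the function names, and the `max_epochs` parameter are all assumptions made for this sketch.

```python
def predict(w, b, x):
    """Linear model f(x) = sign(w . x + b); an activation of 0 maps to +1 here."""
    activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if activation >= 0 else -1


def perceptron_train(data, max_epochs=10):
    """Error-driven training sketch; `data` is a list of (x, y) with y in {-1, +1}."""
    d = len(data[0][0])
    w, b = [0.0] * d, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in data:
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # non-positive margin: treat as a mistake
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # a full pass without updates: training data separated
            break
    return w, b
```

Calling `perceptron_train` on a small linearly separable list of `(x, y)` pairs returns a weight vector and bias that `predict` can then apply to new points.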

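4 A Different View of PerceptronTrain: Optimization
Minimize training-set error rate:
$$\min_{\mathbf{w}, b} \; \underbrace{\frac{1}{N} \sum_{n=1}^{N} \underbrace{[\![\, y_n (\mathbf{w} \cdot \mathbf{x}_n + b) \le 0 \,]\!]}_{\text{zero-one loss}}}_{\epsilon^{\text{train}}}$$
[Plot: loss versus margin $= y (\mathbf{w} \cdot \mathbf{x} + b)$.]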

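5 A Different View of PerceptronTrain: Optimization
Minimize training-set error rate:
$$\min_{\mathbf{w}, b} \; \underbrace{\frac{1}{N} \sum_{n=1}^{N} \underbrace{[\![\, y_n (\mathbf{w} \cdot \mathbf{x}_n + b) \le 0 \,]\!]}_{\text{zero-one loss}}}_{\epsilon^{\text{train}}}$$
This problem is NP-hard; even solving it approximately (i.e., getting within a small constant factor of the optimal value) is NP-hard!
[Plot: loss versus margin $= y (\mathbf{w} \cdot \mathbf{x} + b)$.]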

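6 A Different View of PerceptronTrain: Optimization
Minimize training-set error rate:
$$\min_{\mathbf{w}, b} \; \underbrace{\frac{1}{N} \sum_{n=1}^{N} \underbrace{[\![\, y_n (\mathbf{w} \cdot \mathbf{x}_n + b) \le 0 \,]\!]}_{\text{zero-one loss}}}_{\epsilon^{\text{train}}}$$
What the perceptron does:
$$\min_{\mathbf{w}, b} \; \sum_{n=1}^{N} \underbrace{\max\bigl(-y_n (\mathbf{w} \cdot \mathbf{x}_n + b),\, 0\bigr)}_{\text{perceptron loss}}$$
[Plots: zero-one loss and perceptron loss, each versus margin $= y (\mathbf{w} \cdot \mathbf{x} + b)$.]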
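
To make the contrast between the two objectives concrete, here is a minimal Python sketch that evaluates both on a toy training set. The helper names are hypothetical, and the summed (not averaged) perceptron objective is an assumption about the slide's normalization.

```python
def margin(w, b, x, y):
    """Margin y * (w . x + b) of one labeled example."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)


def zero_one_loss(m):
    """1 if the margin is non-positive (a mistake), else 0."""
    return 1.0 if m <= 0 else 0.0


def perceptron_loss(m):
    """max(-margin, 0): zero when correct, linear in the violation otherwise."""
    return max(-m, 0.0)


def training_error_rate(w, b, data):
    """Average zero-one loss over the training set."""
    return sum(zero_one_loss(margin(w, b, x, y)) for x, y in data) / len(data)


def total_perceptron_loss(w, b, data):
    """Sum of perceptron losses over the training set."""
    return sum(perceptron_loss(margin(w, b, x, y)) for x, y in data)
```

The zero-one loss only counts mistakes, while the perceptron loss also measures how far each mistaken example lies on the wrong side of the hyperplane.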

9 Squash (Sigmoid) Loss?
[Plot: loss versus margin $= y (\mathbf{w} \cdot \mathbf{x} + b)$.]

13 Different Kinds of Objective Functions
Continuous (perceptron loss, squash loss) vs. discrete (zero-one loss). (The sum of two continuous functions is also continuous.)
Convex (perceptron loss) vs. nonconvex (zero-one loss, squash loss). (The sum of two convex functions is also convex.)
Differentiable (squash loss) vs. nondifferentiable (zero-one loss, perceptron loss). (The sum of two differentiable functions is also differentiable.)
[Plot: loss versus margin $= y (\mathbf{w} \cdot \mathbf{x} + b)$.]
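
The convex versus nonconvex split can be probed numerically with a midpoint test on a grid of margins. This is only a sketch: the squash loss is assumed here to be 1 / (1 + exp(margin)), since the slide shows only a plot, and the function names and grid are hypothetical. Finding no violation on a grid does not prove convexity; finding one does disprove it.

```python
import math


def zero_one_loss(m):
    return 1.0 if m <= 0 else 0.0


def perceptron_loss(m):
    return max(-m, 0.0)


def squash_loss(m):
    """Assumed form: a sigmoid that falls from 1 to 0 as the margin grows."""
    return 1.0 / (1.0 + math.exp(m))


def violates_midpoint_convexity(loss, grid, tol=1e-12):
    """True if some pair (a, c) has loss(midpoint) above the chord midpoint."""
    for a in grid:
        for c in grid:
            mid = 0.5 * (a + c)
            if loss(mid) > 0.5 * (loss(a) + loss(c)) + tol:
                return True
    return False


# Margins from -3 to 3 in steps of 0.5.
grid = [k / 2.0 for k in range(-6, 7)]
```

On this grid the zero-one and squash losses each produce a violating pair, while the perceptron loss does not, which agrees with the classification on the slide.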

17 Regularization
Choose your loss function L. To fit the training data:
$$\min_{\mathbf{w}, b} \; \sum_{n=1}^{N} L\bigl(y_n (\mathbf{w} \cdot \mathbf{x}_n + b)\bigr) + R(\mathbf{w}, b)$$
Regularization: add a penalty to the objective function to encourage generalization.
Most common: $R(\mathbf{w}, b) = \lambda \lVert \mathbf{w} \rVert_2^2$. Note that this term is convex and differentiable.
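
A small Python sketch of the regularized objective may help: it adds the squared L2 penalty to a sum of per-example losses. The function names, the choice of the perceptron loss in the usage below, and the value of `lam` are illustrative assumptions, not part of the slides.

```python
def l2_penalty(w, lam):
    """R(w, b) = lam * ||w||_2^2; the bias b is left unpenalized."""
    return lam * sum(wi * wi for wi in w)


def regularized_objective(w, b, data, loss, lam):
    """Sum over examples of loss(margin) plus the L2 penalty on w."""
    fit = 0.0
    for x, y in data:
        m = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        fit += loss(m)
    return fit + l2_penalty(w, lam)


def perceptron_loss(m):
    return max(-m, 0.0)
```

For example, `regularized_objective([3.0, 4.0], 0.0, [([1.0, 0.0], -1)], perceptron_loss, 0.1)` combines a data-fit term of 3 with a penalty of 0.1 × 25. Larger `lam` pushes the minimizer toward smaller weights.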

18 Your new hobby: blindfolded mountain escape

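19 Convex Optimization 101
Assume we are minimizing a function $F : \mathbb{R}^d \to \mathbb{R}$ that is continuous, convex, and differentiable with respect to its input, $\mathbf{z}$.
[Plot: $F(\mathbf{z})$ versus $\mathbf{z}$.]
At a given point $\mathbf{z}_0$, the direction of steepest descent is the negative gradient, $-\mathbf{g}(\mathbf{z}_0)$, where
$$\mathbf{g}(\mathbf{z}_0) = \nabla_{\mathbf{z}} F(\mathbf{z}_0) = \begin{bmatrix} \frac{\partial F}{\partial \mathbf{z}[1]}(\mathbf{z}_0) \\ \frac{\partial F}{\partial \mathbf{z}[2]}(\mathbf{z}_0) \\ \vdots \\ \frac{\partial F}{\partial \mathbf{z}[d]}(\mathbf{z}_0) \end{bmatrix}$$
Note that $\mathbf{g} : \mathbb{R}^d \to \mathbb{R}^d$.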
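
One way to sanity-check a gradient is to approximate each partial derivative with a central finite difference. The sketch below does this in plain Python; the function name, the step `h`, and the quadratic test function mentioned afterwards are assumptions chosen for illustration.

```python
def numerical_gradient(F, z0, h=1e-6):
    """Approximate g(z0) = grad F(z0), one coordinate at a time."""
    grad = []
    for i in range(len(z0)):
        up = list(z0)
        down = list(z0)
        up[i] += h
        down[i] -= h
        # Central difference estimate of the partial derivative w.r.t. z[i].
        grad.append((F(up) - F(down)) / (2.0 * h))
    return grad
```

For $F(\mathbf{z}) = \mathbf{z}[1]^2 + 3\,\mathbf{z}[2]^2$ the exact gradient at $(1, -2)$ is $(2, -12)$, and the estimate should match it up to rounding error. The output has one entry per input coordinate, matching $\mathbf{g} : \mathbb{R}^d \to \mathbb{R}^d$.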

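20 Gradient Descent
Data: function $F : \mathbb{R}^d \to \mathbb{R}$, number of iterations $K$, step sizes $\eta^{(1)}, \ldots, \eta^{(K)}$
Result: $\mathbf{z} \in \mathbb{R}^d$
initialize: $\mathbf{z}^{(0)} = \mathbf{0}$;
for $k \in \{1, \ldots, K\}$ do
    $\mathbf{g}^{(k)} = \nabla_{\mathbf{z}} F(\mathbf{z}^{(k-1)})$;
    $\mathbf{z}^{(k)} = \mathbf{z}^{(k-1)} - \eta^{(k)} \cdot \mathbf{g}^{(k)}$;
end
return $\mathbf{z}^{(K)}$;
Algorithm 1: GradientDescent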
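
The GradientDescent procedure translates almost line for line into Python. In this sketch the caller supplies the gradient function rather than F itself, and the list of step sizes determines the number of iterations K; both choices, along with the names, are assumptions made to keep the example short.

```python
def gradient_descent(grad_F, d, step_sizes):
    """Run K = len(step_sizes) gradient steps starting from z = 0 in R^d."""
    z = [0.0] * d  # initialize: z^(0) = 0
    for eta in step_sizes:
        g = grad_F(z)  # g^(k) = gradient of F at z^(k-1)
        z = [zi - eta * gi for zi, gi in zip(z, g)]  # z^(k) = z^(k-1) - eta * g^(k)
    return z


def example_gradient(z):
    """Gradient of the illustrative F(z) = (z[0] - 3)^2 + (z[1] + 1)^2."""
    return [2.0 * (z[0] - 3.0), 2.0 * (z[1] + 1.0)]
```

With a constant step size of 0.1 for 200 iterations, `gradient_descent(example_gradient, 2, [0.1] * 200)` should approach the minimizer (3, -1) of this convex quadratic. Step sizes that are too large can make the iterates diverge instead.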

21 Gradient Descent

Support Vector Machines: Training with Stochastic Gradient Descent

Support Vector Machines: Training with Stochastic Gradient Descent Support Vector Machines: Training with Stochastic Gradient Descent Machine Learning Spring 2018 The slides are mainly from Vivek Srikumar 1 Support vector machines Training by maximizing margin The SVM

More information

Part 3: Trust-region methods for unconstrained optimization. Nick Gould (RAL)

Part 3: Trust-region methods for unconstrained optimization. Nick Gould (RAL) Part 3: Trust-region methods for unconstrained optimization Nick Gould (RAL) minimize x IR n f(x) MSc course on nonlinear optimization UNCONSTRAINED MINIMIZATION minimize x IR n f(x) where the objective

More information

Learning from Data: Learning Logistic Regressors

Learning from Data: Learning Logistic Regressors Learning from Data: Learning Logistic Regressors November 1, 2005 http://www.anc.ed.ac.uk/ amos/lfd/ Learning Logistic Regressors P(t x) = σ(w T x + b). Want to learn w and b using training data. As before:

More information

1 Overview. 2 The Gradient Descent Algorithm. AM 221: Advanced Optimization Spring 2016

1 Overview. 2 The Gradient Descent Algorithm. AM 221: Advanced Optimization Spring 2016 AM 22: Advanced Optimization Spring 206 Prof. Yaron Singer Lecture 9 February 24th Overview In the previous lecture we reviewed results from multivariate calculus in preparation for our journey into convex

More information

Trust Region Methods for Unconstrained Optimisation

Trust Region Methods for Unconstrained Optimisation Trust Region Methods for Unconstrained Optimisation Lecture 9, Numerical Linear Algebra and Optimisation Oxford University Computing Laboratory, MT 2007 Dr Raphael Hauser (hauser@comlab.ox.ac.uk) The Trust

More information

An adaptive cubic regularization algorithm for nonconvex optimization with convex constraints and its function-evaluation complexity

An adaptive cubic regularization algorithm for nonconvex optimization with convex constraints and its function-evaluation complexity An adaptive cubic regularization algorithm for nonconvex optimization with convex constraints and its function-evaluation complexity Coralia Cartis, Nick Gould and Philippe Toint Department of Mathematics,

More information

Outline. 1 Introduction. 2 Algorithms. 3 Examples. Algorithm 1 General coordinate minimization framework. 1: Choose x 0 R n and set k 0.

Outline. 1 Introduction. 2 Algorithms. 3 Examples. Algorithm 1 General coordinate minimization framework. 1: Choose x 0 R n and set k 0. Outline Coordinate Minimization Daniel P. Robinson Department of Applied Mathematics and Statistics Johns Hopkins University November 27, 208 Introduction 2 Algorithms Cyclic order with exact minimization

More information

Approximate Composite Minimization: Convergence Rates and Examples

Approximate Composite Minimization: Convergence Rates and Examples ISMP 2018 - Bordeaux Approximate Composite Minimization: Convergence Rates and S. Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi MLO Lab, EPFL, Switzerland sebastian.stich@epfl.ch July 4, 2018

More information

Machine Learning (CSE 446): Pratical issues: optimization and learning

Machine Learning (CSE 446): Pratical issues: optimization and learning Machine Learning (CSE 446): Pratical issues: optimization and learning John Thickstun guest lecture c 2018 University of Washington cse446-staff@cs.washington.edu 1 / 10 Review 1 / 10 Our running example

More information

Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization

Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization for Strongly Convex Stochastic Optimization Microsoft Research New England NIPS 2011 Optimization Workshop Stochastic Convex Optimization Setting Goal: Optimize convex function F ( ) over convex domain

More information

CSCI 1951-G Optimization Methods in Finance Part 00: Course Logistics Introduction to Finance Optimization Problems

CSCI 1951-G Optimization Methods in Finance Part 00: Course Logistics Introduction to Finance Optimization Problems CSCI 1951-G Optimization Methods in Finance Part 00: Course Logistics Introduction to Finance Optimization Problems January 26, 2018 1 / 24 Basic information All information is available in the syllabus

More information

Is Greedy Coordinate Descent a Terrible Algorithm?

Is Greedy Coordinate Descent a Terrible Algorithm? Is Greedy Coordinate Descent a Terrible Algorithm? Julie Nutini, Mark Schmidt, Issam Laradji, Michael Friedlander, Hoyt Koepke University of British Columbia Optimization and Big Data, 2015 Context: Random

More information

Large-Scale SVM Optimization: Taking a Machine Learning Perspective

Large-Scale SVM Optimization: Taking a Machine Learning Perspective Large-Scale SVM Optimization: Taking a Machine Learning Perspective Shai Shalev-Shwartz Toyota Technological Institute at Chicago Joint work with Nati Srebro Talk at NEC Labs, Princeton, August, 2008 Shai

More information

IE 495 Lecture 11. The LShaped Method. Prof. Jeff Linderoth. February 19, February 19, 2003 Stochastic Programming Lecture 11 Slide 1

IE 495 Lecture 11. The LShaped Method. Prof. Jeff Linderoth. February 19, February 19, 2003 Stochastic Programming Lecture 11 Slide 1 IE 495 Lecture 11 The LShaped Method Prof. Jeff Linderoth February 19, 2003 February 19, 2003 Stochastic Programming Lecture 11 Slide 1 Before We Begin HW#2 $300 $0 http://www.unizh.ch/ior/pages/deutsch/mitglieder/kall/bib/ka-wal-94.pdf

More information

Exercise List: Proving convergence of the (Stochastic) Gradient Descent Method for the Least Squares Problem.

Exercise List: Proving convergence of the (Stochastic) Gradient Descent Method for the Least Squares Problem. Exercise List: Proving convergence of the (Stochastic) Gradient Descent Method for the Least Squares Problem. Robert M. Gower. October 3, 07 Introduction This is an exercise in proving the convergence

More information

Sensitivity Analysis with Data Tables. 10% annual interest now =$110 one year later. 10% annual interest now =$121 one year later

Sensitivity Analysis with Data Tables. 10% annual interest now =$110 one year later. 10% annual interest now =$121 one year later Sensitivity Analysis with Data Tables Time Value of Money: A Special kind of Trade-Off: $100 @ 10% annual interest now =$110 one year later $110 @ 10% annual interest now =$121 one year later $100 @ 10%

More information

The Irrevocable Multi-Armed Bandit Problem

The Irrevocable Multi-Armed Bandit Problem The Irrevocable Multi-Armed Bandit Problem Ritesh Madan Qualcomm-Flarion Technologies May 27, 2009 Joint work with Vivek Farias (MIT) 2 Multi-Armed Bandit Problem n arms, where each arm i is a Markov Decision

More information

Contents Critique 26. portfolio optimization 32

Contents Critique 26. portfolio optimization 32 Contents Preface vii 1 Financial problems and numerical methods 3 1.1 MATLAB environment 4 1.1.1 Why MATLAB? 5 1.2 Fixed-income securities: analysis and portfolio immunization 6 1.2.1 Basic valuation of

More information

Ellipsoid Method. ellipsoid method. convergence proof. inequality constraints. feasibility problems. Prof. S. Boyd, EE392o, Stanford University

Ellipsoid Method. ellipsoid method. convergence proof. inequality constraints. feasibility problems. Prof. S. Boyd, EE392o, Stanford University Ellipsoid Method ellipsoid method convergence proof inequality constraints feasibility problems Prof. S. Boyd, EE392o, Stanford University Challenges in cutting-plane methods can be difficult to compute

More information

Portfolio selection with multiple risk measures

Portfolio selection with multiple risk measures Portfolio selection with multiple risk measures Garud Iyengar Columbia University Industrial Engineering and Operations Research Joint work with Carlos Abad Outline Portfolio selection and risk measures

More information

What can we do with numerical optimization?

What can we do with numerical optimization? Optimization motivation and background Eddie Wadbro Introduction to PDE Constrained Optimization, 2016 February 15 16, 2016 Eddie Wadbro, Introduction to PDE Constrained Optimization, February 15 16, 2016

More information

Decomposition Methods

Decomposition Methods Decomposition Methods separable problems, complicating variables primal decomposition dual decomposition complicating constraints general decomposition structures Prof. S. Boyd, EE364b, Stanford University

More information

k-layer neural networks: High capacity scoring functions + tips on how to train them

k-layer neural networks: High capacity scoring functions + tips on how to train them k-layer neural networks: High capacity scoring functions + tips on how to train them A new class of scoring functions Linear scoring function s = W x + b 2-layer Neural Network s 1 = W 1 x + b 1 h = max(0,

More information

COMPARING NEURAL NETWORK AND REGRESSION MODELS IN ASSET PRICING MODEL WITH HETEROGENEOUS BELIEFS

COMPARING NEURAL NETWORK AND REGRESSION MODELS IN ASSET PRICING MODEL WITH HETEROGENEOUS BELIEFS Akademie ved Leske republiky Ustav teorie informace a automatizace Academy of Sciences of the Czech Republic Institute of Information Theory and Automation RESEARCH REPORT JIRI KRTEK COMPARING NEURAL NETWORK

More information

Steepest descent and conjugate gradient methods with variable preconditioning

Steepest descent and conjugate gradient methods with variable preconditioning Ilya Lashuk and Andrew Knyazev 1 Steepest descent and conjugate gradient methods with variable preconditioning Ilya Lashuk (the speaker) and Andrew Knyazev Department of Mathematics and Center for Computational

More information

Statistics and Machine Learning Homework1

Statistics and Machine Learning Homework1 Statistics and Machine Learning Homework1 Yuh-Jye Lee National Taiwan University of Science and Technology dmlab1.csie.ntust.edu.tw/leepage/index c.htm Exercise 1: (a) Solve 1 min x R 2 2 xt 1 0 0 900

More information

Barapatre Omprakash et.al; International Journal of Advance Research, Ideas and Innovations in Technology

Barapatre Omprakash et.al; International Journal of Advance Research, Ideas and Innovations in Technology ISSN: 2454-132X Impact factor: 4.295 (Volume 4, Issue 2) Available online at: www.ijariit.com Stock Price Prediction using Artificial Neural Network Omprakash Barapatre omprakashbarapatre@bitraipur.ac.in

More information

CPS 270: Artificial Intelligence Markov decision processes, POMDPs

CPS 270: Artificial Intelligence  Markov decision processes, POMDPs CPS 270: Artificial Intelligence http://www.cs.duke.edu/courses/fall08/cps270/ Markov decision processes, POMDPs Instructor: Vincent Conitzer Warmup: a Markov process with rewards We derive some reward

More information

2D5362 Machine Learning

2D5362 Machine Learning 2D5362 Machine Learning Reinforcement Learning MIT GALib Available at http://lancet.mit.edu/ga/ download galib245.tar.gz gunzip galib245.tar.gz tar xvf galib245.tar cd galib245 make or access my files

More information

Final Projects Introduction to Numerical Analysis Professor: Paul J. Atzberger

Final Projects Introduction to Numerical Analysis Professor: Paul J. Atzberger Final Projects Introduction to Numerical Analysis Professor: Paul J. Atzberger Due Date: Friday, December 12th Instructions: In the final project you are to apply the numerical methods developed in the

More information

ECON 815. A Basic New Keynesian Model II

ECON 815. A Basic New Keynesian Model II ECON 815 A Basic New Keynesian Model II Winter 2015 Queen s University ECON 815 1 Unemployment vs. Inflation 12 10 Unemployment 8 6 4 2 0 1 1.5 2 2.5 3 3.5 4 4.5 5 Core Inflation 14 12 10 Unemployment

More information

Convergence of trust-region methods based on probabilistic models

Convergence of trust-region methods based on probabilistic models Convergence of trust-region methods based on probabilistic models A. S. Bandeira K. Scheinberg L. N. Vicente October 24, 2013 Abstract In this paper we consider the use of probabilistic or random models

More information

Global convergence rate analysis of unconstrained optimization methods based on probabilistic models

Global convergence rate analysis of unconstrained optimization methods based on probabilistic models Math. Program., Ser. A DOI 10.1007/s10107-017-1137-4 FULL LENGTH PAPER Global convergence rate analysis of unconstrained optimization methods based on probabilistic models C. Cartis 1 K. Scheinberg 2 Received:

More information

Risk Management for Chemical Supply Chain Planning under Uncertainty

Risk Management for Chemical Supply Chain Planning under Uncertainty for Chemical Supply Chain Planning under Uncertainty Fengqi You and Ignacio E. Grossmann Dept. of Chemical Engineering, Carnegie Mellon University John M. Wassick The Dow Chemical Company Introduction

More information

GLOBAL CONVERGENCE OF GENERAL DERIVATIVE-FREE TRUST-REGION ALGORITHMS TO FIRST AND SECOND ORDER CRITICAL POINTS

GLOBAL CONVERGENCE OF GENERAL DERIVATIVE-FREE TRUST-REGION ALGORITHMS TO FIRST AND SECOND ORDER CRITICAL POINTS GLOBAL CONVERGENCE OF GENERAL DERIVATIVE-FREE TRUST-REGION ALGORITHMS TO FIRST AND SECOND ORDER CRITICAL POINTS ANDREW R. CONN, KATYA SCHEINBERG, AND LUíS N. VICENTE Abstract. In this paper we prove global

More information

Portfolio Management and Optimal Execution via Convex Optimization

Portfolio Management and Optimal Execution via Convex Optimization Portfolio Management and Optimal Execution via Convex Optimization Enzo Busseti Stanford University April 9th, 2018 Problems portfolio management choose trades with optimization minimize risk, maximize

More information

Definition 4.1. In a stochastic process T is called a stopping time if you can tell when it happens.

Definition 4.1. In a stochastic process T is called a stopping time if you can tell when it happens. 102 OPTIMAL STOPPING TIME 4. Optimal Stopping Time 4.1. Definitions. On the first day I explained the basic problem using one example in the book. On the second day I explained how the solution to the

More information

Budget Management In GSP (2018)

Budget Management In GSP (2018) Budget Management In GSP (2018) Yahoo! March 18, 2018 Miguel March 18, 2018 1 / 26 Today s Presentation: Budget Management Strategies in Repeated auctions, Balseiro, Kim, and Mahdian, WWW2017 Learning

More information

Lecture 17: More on Markov Decision Processes. Reinforcement learning

Lecture 17: More on Markov Decision Processes. Reinforcement learning Lecture 17: More on Markov Decision Processes. Reinforcement learning Learning a model: maximum likelihood Learning a value function directly Monte Carlo Temporal-difference (TD) learning COMP-424, Lecture

More information

Lecture 7: Bayesian approach to MAB - Gittins index

Lecture 7: Bayesian approach to MAB - Gittins index Advanced Topics in Machine Learning and Algorithmic Game Theory Lecture 7: Bayesian approach to MAB - Gittins index Lecturer: Yishay Mansour Scribe: Mariano Schain 7.1 Introduction In the Bayesian approach

More information

Econ Homework 4 - Answers ECONOMIC APPLICATIONS OF CONSTRAINED OPTIMIZATION. 1. Assume that a rm produces product x using k and l, where

Econ Homework 4 - Answers ECONOMIC APPLICATIONS OF CONSTRAINED OPTIMIZATION. 1. Assume that a rm produces product x using k and l, where Econ 4808 - Homework 4 - Answers ECONOMIC APPLICATIONS OF CONSTRAINED OPTIMIZATION Graded questions: : A points; B - point; C - point : B points : B points. Assume that a rm produces product x using k

More information

Catalyst Acceleration for Gradient-Based Non-Convex Optimization

Catalyst Acceleration for Gradient-Based Non-Convex Optimization Catalyst Acceleration for Gradient-Based Non-Convex Optimization Courtney Paquette, Hongzhou Lin, Dmitriy Drusvyatskiy, Julien Mairal, Zaid Harchaoui To cite this version: Courtney Paquette, Hongzhou Lin,

More information

Golden-Section Search for Optimization in One Dimension

Golden-Section Search for Optimization in One Dimension Golden-Section Search for Optimization in One Dimension Golden-section search for maximization (or minimization) is similar to the bisection method for root finding. That is, it does not use the derivatives

More information

Log-Robust Portfolio Management

Log-Robust Portfolio Management Log-Robust Portfolio Management Dr. Aurélie Thiele Lehigh University Joint work with Elcin Cetinkaya and Ban Kawas Research partially supported by the National Science Foundation Grant CMMI-0757983 Dr.

More information

Portfolio replication with sparse regression

Portfolio replication with sparse regression Portfolio replication with sparse regression Akshay Kothkari, Albert Lai and Jason Morton December 12, 2008 Suppose an investor (such as a hedge fund or fund-of-fund) holds a secret portfolio of assets,

More information

$tock Forecasting using Machine Learning

$tock Forecasting using Machine Learning $tock Forecasting using Machine Learning Greg Colvin, Garrett Hemann, and Simon Kalouche Abstract We present an implementation of 3 different machine learning algorithms gradient descent, support vector

More information

Equity correlations implied by index options: estimation and model uncertainty analysis

Equity correlations implied by index options: estimation and model uncertainty analysis 1/18 : estimation and model analysis, EDHEC Business School (joint work with Rama COT) Modeling and managing financial risks Paris, 10 13 January 2011 2/18 Outline 1 2 of multi-asset models Solution to

More information

Ellipsoid Method. ellipsoid method. convergence proof. inequality constraints. feasibility problems. Prof. S. Boyd, EE364b, Stanford University

Ellipsoid Method. ellipsoid method. convergence proof. inequality constraints. feasibility problems. Prof. S. Boyd, EE364b, Stanford University Ellipsoid Method ellipsoid method convergence proof inequality constraints feasibility problems Prof. S. Boyd, EE364b, Stanford University Ellipsoid method developed by Shor, Nemirovsky, Yudin in 1970s

More information

Price Impact and Optimal Execution Strategy

Price Impact and Optimal Execution Strategy OXFORD MAN INSTITUE, UNIVERSITY OF OXFORD SUMMER RESEARCH PROJECT Price Impact and Optimal Execution Strategy Bingqing Liu Supervised by Stephen Roberts and Dieter Hendricks Abstract Price impact refers

More information

Allocation of Risk Capital via Intra-Firm Trading

Allocation of Risk Capital via Intra-Firm Trading Allocation of Risk Capital via Intra-Firm Trading Sean Hilden Department of Mathematical Sciences Carnegie Mellon University December 5, 2005 References 1. Artzner, Delbaen, Eber, Heath: Coherent Measures

More information

Maximizing of Portfolio Performance

Maximizing of Portfolio Performance Maximizing of Portfolio Performance PEKÁR Juraj, BREZINA Ivan, ČIČKOVÁ Zuzana Department of Operations Research and Econometrics, University of Economics, Bratislava, Slovakia Outline Problem of portfolio

More information

Financial Optimization ISE 347/447. Lecture 15. Dr. Ted Ralphs

Financial Optimization ISE 347/447. Lecture 15. Dr. Ted Ralphs Financial Optimization ISE 347/447 Lecture 15 Dr. Ted Ralphs ISE 347/447 Lecture 15 1 Reading for This Lecture C&T Chapter 12 ISE 347/447 Lecture 15 2 Stock Market Indices A stock market index is a statistic

More information

ALGORITHMIC TRADING STRATEGIES IN PYTHON

ALGORITHMIC TRADING STRATEGIES IN PYTHON 7-Course Bundle In ALGORITHMIC TRADING STRATEGIES IN PYTHON Learn to use 15+ trading strategies including Statistical Arbitrage, Machine Learning, Quantitative techniques, Forex valuation methods, Options

More information

Exercise 1 Modelling and Convexity

Exercise 1 Modelling and Convexity TMA947 / MMG621 Nonlinear optimization Exercise 1 Modelling and Convexity Emil Gustavsson, Michael Patriksson, Adam Wojciechowski, Zuzana Šabartová September 16, 2014 E1.1 (easy) To produce a g. of cookies

More information

Scaling SGD Batch Size to 32K for ImageNet Training

Scaling SGD Batch Size to 32K for ImageNet Training Scaling SGD Batch Size to 32K for ImageNet Training Yang You Computer Science Division of UC Berkeley youyang@cs.berkeley.edu Yang You (youyang@cs.berkeley.edu) 32K SGD Batch Size CS Division of UC Berkeley

More information

On the complexity of the steepest-descent with exact linesearches

On the complexity of the steepest-descent with exact linesearches On the complexity of the steepest-descent with exact linesearches Coralia Cartis, Nicholas I. M. Gould and Philippe L. Toint 9 September 22 Abstract The worst-case complexity of the steepest-descent algorithm

More information

Column generation to solve planning problems

Column generation to solve planning problems Column generation to solve planning problems ALGORITMe Han Hoogeveen 1 Continuous Knapsack problem We are given n items with integral weight a j ; integral value c j. B is a given integer. Goal: Find a

More information

Adaptive cubic regularisation methods for unconstrained optimization. Part II: worst-case function- and derivative-evaluation complexity

Adaptive cubic regularisation methods for unconstrained optimization. Part II: worst-case function- and derivative-evaluation complexity Adaptive cubic regularisation methods for unconstrained optimization. Part II: worst-case function- and derivative-evaluation complexity Coralia Cartis,, Nicholas I. M. Gould, and Philippe L. Toint September

More information

Penalty Functions. The Premise Quadratic Loss Problems and Solutions

Penalty Functions. The Premise Quadratic Loss Problems and Solutions Penalty Functions The Premise Quadratic Loss Problems and Solutions The Premise You may have noticed that the addition of constraints to an optimization problem has the effect of making it much more difficult.

More information

Journal of Internet Banking and Commerce

Journal of Internet Banking and Commerce Journal of Internet Banking and Commerce An open access Internet journal (http://www.icommercecentral.com) Journal of Internet Banking and Commerce, December 2017, vol. 22, no. 3 STOCK PRICE PREDICTION

More information

Robust Dual Dynamic Programming

Robust Dual Dynamic Programming 1 / 18 Robust Dual Dynamic Programming Angelos Georghiou, Angelos Tsoukalas, Wolfram Wiesemann American University of Beirut Olayan School of Business 31 May 217 2 / 18 Inspired by SDDP Stochastic optimization

More information

Adaptive cubic overestimation methods for unconstrained optimization

Adaptive cubic overestimation methods for unconstrained optimization Report no. NA-07/20 Adaptive cubic overestimation methods for unconstrained optimization Coralia Cartis School of Mathematics, University of Edinburgh, The King s Buildings, Edinburgh, EH9 3JZ, Scotland,

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning Monte Carlo Methods Mark Schmidt University of British Columbia Winter 2018 Last Time: Markov Chains We can use Markov chains for density estimation, p(x) = p(x 1 ) }{{} d p(x

More information

INTRODUCTION TO MODERN PORTFOLIO OPTIMIZATION

INTRODUCTION TO MODERN PORTFOLIO OPTIMIZATION INTRODUCTION TO MODERN PORTFOLIO OPTIMIZATION Abstract. This is the rst part in my tutorial series- Follow me to Optimization Problems. In this tutorial, I will touch on the basic concepts of portfolio

More information

Randomized Full Waveform Inversion

Randomized Full Waveform Inversion Consortium 2010 Randomized Full Waveform Inversion Peyman P. Moghaddam SLIM University of British Columbia Motivation Cost of the FWI is propor?onal to the number of shots and it requires hundreds of RTM

More information

Problem Set 1 Answer Key. I. Short Problems 1. Check whether the following three functions represent the same underlying preferences

Problem Set 1 Answer Key. I. Short Problems 1. Check whether the following three functions represent the same underlying preferences Problem Set Answer Key I. Short Problems. Check whether the following three functions represent the same underlying preferences u (q ; q ) = q = + q = u (q ; q ) = q + q u (q ; q ) = ln q + ln q All three

More information

CS 3331 Numerical Methods Lecture 2: Functions of One Variable. Cherung Lee

CS 3331 Numerical Methods Lecture 2: Functions of One Variable. Cherung Lee CS 3331 Numerical Methods Lecture 2: Functions of One Variable Cherung Lee Outline Introduction Solving nonlinear equations: find x such that f(x ) = 0. Binary search methods: (Bisection, regula falsi)

More information

Solving real-life portfolio problem using stochastic programming and Monte-Carlo techniques

Solving real-life portfolio problem using stochastic programming and Monte-Carlo techniques Solving real-life portfolio problem using stochastic programming and Monte-Carlo techniques 1 Introduction Martin Branda 1 Abstract. We deal with real-life portfolio problem with Value at Risk, transaction

More information

Graphs Details Math Examples Using data Tax example. Decision. Intermediate Micro. Lecture 5. Chapter 5 of Varian

Graphs Details Math Examples Using data Tax example. Decision. Intermediate Micro. Lecture 5. Chapter 5 of Varian Decision Intermediate Micro Lecture 5 Chapter 5 of Varian Decision-making Now have tools to model decision-making Set of options At-least-as-good sets Mathematical tools to calculate exact answer Problem

More information

ECS171: Machine Learning

ECS171: Machine Learning ECS171: Machine Learning Lecture 15: Tree-based Algorithms Cho-Jui Hsieh UC Davis March 7, 2018 Outline Decision Tree Random Forest Gradient Boosted Decision Tree (GBDT) Decision Tree Each node checks

More information

A Local Search Algorithm for the Witsenhausen s Counterexample

A Local Search Algorithm for the Witsenhausen s Counterexample 27 IEEE 56th Annual Conference on Decision and Control (CDC) December 2-5, 27, Melbourne, Australia A Local Search Algorithm for the Witsenhausen s Counterexample Shih-Hao Tseng and Ao Tang Abstract We

More information

International Journal of Computer Engineering and Applications, Volume XII, Issue II, Feb. 18, ISSN

International Journal of Computer Engineering and Applications, Volume XII, Issue II, Feb. 18,   ISSN Volume XII, Issue II, Feb. 18, www.ijcea.com ISSN 31-3469 AN INVESTIGATION OF FINANCIAL TIME SERIES PREDICTION USING BACK PROPAGATION NEURAL NETWORKS K. Jayanthi, Dr. K. Suresh 1 Department of Computer

More information

SOLVING ROBUST SUPPLY CHAIN PROBLEMS

SOLVING ROBUST SUPPLY CHAIN PROBLEMS SOLVING ROBUST SUPPLY CHAIN PROBLEMS Daniel Bienstock Nuri Sercan Özbay Columbia University, New York November 13, 2005 Project with Lucent Technologies Optimize the inventory buffer levels in a complicated

More information

Lecture 1: The market and consumer theory. Intermediate microeconomics Jonas Vlachos Stockholms universitet

Lecture 1: The market and consumer theory. Intermediate microeconomics Jonas Vlachos Stockholms universitet Lecture 1: The market and consumer theory Intermediate microeconomics Jonas Vlachos Stockholms universitet 1 The market Demand Supply Equilibrium Comparative statics Elasticities 2 Demand Demand function.

More information
