High Dimensional Bayesian Optimisation and Bandits via Additive Models


1/20 High Dimensional Bayesian Optimisation and Bandits via Additive Models
Kirthevasan Kandasamy, Jeff Schneider, Barnabás Póczos
ICML 15 July

2/20 Bandits & Optimisation
Maximum likelihood inference in computational astrophysics.
[Figure: parameters (e.g. Hubble constant, baryonic density), cosmological simulator, observation.]
Expensive black-box function.
Examples: hyper-parameter tuning in ML; optimal control strategy in robotics.

3/20 Bandits & Optimisation
$f : [0, 1]^D \to \mathbb{R}$ is an expensive, black-box, nonconvex function.
Let $x_* = \operatorname{argmax}_x f(x)$.
[Figure: plot of $f(x)$ against $x$, marking $x_*$ and $f(x_*)$.]
Optimisation = minimise simple regret: $S_T = f(x_*) - \max_{t=1,\dots,T} f(x_t)$.
Bandits = minimise cumulative regret: $R_T = \sum_{t=1}^{T} \big( f(x_*) - f(x_t) \big)$.
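
The two regret notions above can be compared on a toy run. The sketch below is only an illustration: the objective `f`, its known maximum value, and the query sequence are hypothetical and not taken from the talk.

```python
def simple_regret(f_opt, f_values):
    """S_T = f(x_*) - max_t f(x_t): gap between the optimum and the best query."""
    return f_opt - max(f_values)

def cumulative_regret(f_opt, f_values):
    """R_T = sum_t (f(x_*) - f(x_t)): total gap accumulated over all queries."""
    return sum(f_opt - v for v in f_values)

def f(x):
    # Hypothetical toy objective on [0, 1], maximised at x = 0.5 with f = 1.
    return 1.0 - (x - 0.5) ** 2

queries = [0.0, 1.0, 0.5, 0.25]
values = [f(x) for x in queries]
S_T = simple_regret(1.0, values)
R_T = cumulative_regret(1.0, values)
```

Simple regret only rewards the best query found so far, whereas cumulative regret penalises every poor query along the way, which is why the bandit setting is the harder of the two.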

4/20 Gaussian Process (Bayesian) Optimisation
Model $f \sim \mathcal{GP}(0, \kappa)$.
Obtain the posterior GP.
Maximise the acquisition function $\varphi_t$: $x_t = \operatorname{argmax}_x \varphi_t(x)$.
GP-UCB: $\varphi_t(x) = \mu_{t-1}(x) + \beta_t^{1/2} \sigma_{t-1}(x)$ (Srinivas et al. 2010).
Other choices of $\varphi_t$: Expected Improvement (GP-EI), Thompson Sampling etc.
[Figure: plots of $f(x)$ with the GP posterior, and of $\varphi_t(x)$ with its maximiser $x_t$.]
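
A minimal sketch of the GP-UCB loop described above, assuming a squared exponential kernel with fixed hyper-parameters, a dense 1-D grid in place of a proper optimiser, and a made-up $\beta_t$ schedule. The names `se_kernel`, `gp_posterior`, `gp_ucb_step` and the toy objective `f` are illustrative and do not come from the authors' code.

```python
import numpy as np

SCALE, BANDWIDTH, NOISE = 1.0, 0.2, 1e-3   # assumed hyper-parameters

def se_kernel(a, b):
    """Squared exponential kernel between points a (n, d) and b (m, d)."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return SCALE * np.exp(-sq / (2.0 * BANDWIDTH ** 2))

def gp_posterior(X, y, Xs):
    """Posterior mean and standard deviation of a zero-mean GP at the rows of Xs."""
    K = se_kernel(X, X) + NOISE * np.eye(len(X))
    Ks = se_kernel(Xs, X)
    mu = Ks @ np.linalg.solve(K, y)
    var = SCALE - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return mu, np.sqrt(np.maximum(var, 0.0))

def gp_ucb_step(X, y, candidates, beta):
    """Return the candidate maximising mu + sqrt(beta) * sigma."""
    mu, sigma = gp_posterior(X, y, candidates)
    return candidates[np.argmax(mu + np.sqrt(beta) * sigma)]

def f(x):
    return -(x - 0.3) ** 2   # hypothetical toy objective, maximised at x = 0.3

grid = np.linspace(0.0, 1.0, 201)[:, None]
X = np.array([[0.1], [0.9]])               # two initial queries
y = f(X[:, 0])
for t in range(1, 16):
    beta_t = 2.0 * np.log(t + 1)           # assumed schedule, not a constant from the theory
    x_next = gp_ucb_step(X, y, grid, beta_t)
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next[0]))
best_x = X[np.argmax(y), 0]
```

The grid search is what becomes infeasible as the dimension grows, which motivates the next slide.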

5/20 Scaling to Higher Dimensions
Two key challenges:
Statistical difficulty: nonparametric sample complexity is exponential in $D$.
Computational difficulty: optimising $\varphi_t$ to within $\zeta$ accuracy requires $O(\zeta^{-D})$ effort.
Existing work:
(Chen et al. 2012): $f$ depends on a small number of variables. Find the variables and then run GP-UCB.
(Wang et al. 2013): $f$ varies along a lower dimensional subspace. Run GP-EI on a random subspace.
(Djolonga et al. 2013): $f$ varies along a lower dimensional subspace. Find the subspace and then run GP-UCB.
All three assume $f$ varies only along a low dimensional subspace and perform BO on that subspace. This assumption is too strong in realistic settings.
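
To make the $O(\zeta^{-D})$ cost concrete, the sketch below counts the points in a uniform grid of resolution $\zeta$ over $[0, 1]^D$. Grid search is an assumption made here for illustration; the talk itself uses DiRect to maximise the acquisition function.

```python
def grid_points(zeta, dim):
    """Number of points in a uniform grid of spacing zeta covering [0, 1]^dim."""
    per_axis = round(1.0 / zeta)
    return per_axis ** dim

# Cost of a resolution-0.1 grid as the dimension grows.
costs = {D: grid_points(0.1, D) for D in (1, 2, 5, 10, 20)}
```

At $\zeta = 0.1$ the grid already needs $10^{10}$ evaluations of $\varphi_t$ in ten dimensions.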

6/20 Additive Functions
Structural assumption: $f(x) = f^{(1)}(x^{(1)}) + f^{(2)}(x^{(2)}) + \dots + f^{(M)}(x^{(M)})$,
where $x^{(j)} \in \mathcal{X}^{(j)} = [0, 1]^d$, $d \ll D$, and $x^{(i)} \cap x^{(j)} = \emptyset$.
E.g. $f(x_{\{1,\dots,10\}}) = f^{(1)}(x_{\{1,3,9\}}) + f^{(2)}(x_{\{2,4,8\}}) + f^{(3)}(x_{\{5,6,10\}})$.
Call $\{\mathcal{X}^{(j)}\}_{j=1}^{M} = \{(1, 3, 9), (2, 4, 8), (5, 6, 10)\}$ the decomposition.
Assume each $f^{(j)} \sim \mathcal{GP}(0, \kappa^{(j)})$. Then $f \sim \mathcal{GP}(0, \kappa)$ where
$\kappa(x, x') = \kappa^{(1)}(x^{(1)}, x'^{(1)}) + \dots + \kappa^{(M)}(x^{(M)}, x'^{(M)})$.
Given $(X, Y) = \{(x_i, y_i)\}_{i=1}^{T}$ and a test point $x$,
$f^{(j)}(x^{(j)}) \mid X, Y \sim \mathcal{N}\big(\mu^{(j)}, \sigma^{(j)2}\big)$.
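
The per-group posterior above follows from standard Gaussian conditioning: only $\kappa^{(j)}$ couples $f^{(j)}$ to the observations of the sum, while the Gram matrix uses the full additive kernel. The sketch below assumes squared exponential group kernels with shared, fixed hyper-parameters and a hypothetical 0-indexed decomposition for $D = 4$; the function names are illustrative only.

```python
import numpy as np

SCALE, BANDWIDTH, NOISE = 1.0, 0.3, 1e-2   # assumed hyper-parameters

def se_kernel(a, b):
    """Squared exponential kernel between points a (n, d) and b (m, d)."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return SCALE * np.exp(-sq / (2.0 * BANDWIDTH ** 2))

def additive_kernel(a, b, groups):
    """kappa(x, x') = sum_j kappa_j(x^(j), x'^(j)) over disjoint coordinate groups."""
    return sum(se_kernel(a[:, g], b[:, g]) for g in groups)

def group_posterior(X, y, Xs, group, groups):
    """Posterior mean and variance of f^(j) at the rows of Xs, given observations of f."""
    K = additive_kernel(X, X, groups) + NOISE * np.eye(len(X))
    k_j = se_kernel(Xs[:, group], X[:, group])   # cross-covariance of f^(j) with the data
    mu = k_j @ np.linalg.solve(K, y)
    var = SCALE - np.einsum("ij,ji->i", k_j, np.linalg.solve(K, k_j.T))
    return mu, var

groups = [[0, 2], [1, 3]]                        # hypothetical decomposition
rng = np.random.default_rng(0)
X = rng.uniform(size=(30, 4))
y = np.sin(3 * X[:, 0]) * X[:, 2] + np.cos(2 * X[:, 1] + X[:, 3])
Xs = rng.uniform(size=(5, 4))

mu_1, var_1 = group_posterior(X, y, Xs, groups[0], groups)
mu_2, var_2 = group_posterior(X, y, Xs, groups[1], groups)

# Sanity check: the group means add up to the posterior mean of the full additive GP.
K = additive_kernel(X, X, groups) + NOISE * np.eye(len(X))
mu_full = additive_kernel(Xs, X, groups) @ np.linalg.solve(K, y)
```

Because the cross-covariances sum to the full one, the group posterior means sum exactly to the posterior mean of $f$.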

7/20 Outline
1. GP-UCB
2. The Add-GP-UCB algorithm
   Bounds on $S_T$: exponential in $D$ → linear in $D$.
   An easy-to-optimise acquisition function.
   Performs well even when $f$ is not additive.
3. Experiments
4. Conclusion & some open questions

8/20 GP-UCB
$x_t = \operatorname{argmax}_{x \in \mathcal{X}} \; \mu_{t-1}(x) + \beta_t^{1/2} \sigma_{t-1}(x)$
Squared Exponential Kernel: $\kappa(x, x') = A \exp\!\left( -\frac{\lVert x - x' \rVert^2}{2h^2} \right)$
Theorem (Srinivas et al. 2010). Let $f \sim \mathcal{GP}(0, \kappa)$. Then w.h.p.,
$S_T \in O\!\left( \sqrt{\frac{D^D (\log T)^D}{T}} \right)$.

9/20 GP-UCB on additive κ
Suppose $f \sim \mathcal{GP}(0, \kappa)$ where $\kappa(x, x') = \kappa^{(1)}(x^{(1)}, x'^{(1)}) + \dots + \kappa^{(M)}(x^{(M)}, x'^{(M)})$.
Can be shown: if each $\kappa^{(j)}$ is an SE kernel,
$S_T \in O\!\left( \sqrt{\frac{D^2 d^d (\log T)^d}{T}} \right)$.
But $\varphi_t = \mu_{t-1} + \beta_t^{1/2} \sigma_{t-1}$ is $D$-dimensional!

10/20 Add-GP-UCB
$\varphi_t(x) = \sum_{j=1}^{M} \underbrace{\mu^{(j)}_{t-1}(x^{(j)}) + \beta_t^{1/2} \sigma^{(j)}_{t-1}(x^{(j)})}_{\varphi^{(j)}_t(x^{(j)})}$
Maximise each $\varphi^{(j)}_t$ separately.
Requires only $O(\mathrm{poly}(D)\, \zeta^{-d})$ effort (vs $O(\zeta^{-D})$ for GP-UCB).
Theorem. Let $f^{(j)} \sim \mathcal{GP}(0, \kappa^{(j)})$ and $f = \sum_j f^{(j)}$. Then w.h.p.,
$S_T \in O\!\left( \sqrt{\frac{D^2 d^d (\log T)^d}{T}} \right)$.
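
The group-wise maximisation can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the decomposition `GROUPS` is hypothetical and assumed known, the group kernels are squared exponential with fixed hyper-parameters, random candidates in $[0, 1]^d$ stand in for DiRect, and the $\beta_t$ schedule is made up.

```python
import numpy as np

GROUPS = [[0, 1], [2, 3]]                  # hypothetical decomposition, D = 4, d = 2
SCALE, BANDWIDTH, NOISE = 1.0, 0.3, 1e-2   # assumed hyper-parameters

def se_kernel(a, b):
    """Squared exponential kernel between points a (n, d) and b (m, d)."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return SCALE * np.exp(-sq / (2.0 * BANDWIDTH ** 2))

def add_gp_ucb_step(X, y, beta, rng, n_candidates=500):
    """Maximise each phi^(j) over its own d-dimensional group and concatenate."""
    K = sum(se_kernel(X[:, g], X[:, g]) for g in GROUPS) + NOISE * np.eye(len(X))
    alpha = np.linalg.solve(K, y)
    x_next = np.empty(X.shape[1])
    for g in GROUPS:
        cand = rng.uniform(size=(n_candidates, len(g)))    # search only in [0, 1]^d
        k_j = se_kernel(cand, X[:, g])
        mu = k_j @ alpha
        var = SCALE - np.einsum("ij,ji->i", k_j, np.linalg.solve(K, k_j.T))
        phi_j = mu + np.sqrt(beta * np.maximum(var, 0.0))
        x_next[g] = cand[np.argmax(phi_j)]
    return x_next

def f(x):
    return -np.sum((x - 0.5) ** 2)         # toy additive objective, maximised at 0.5

rng = np.random.default_rng(0)
X = rng.uniform(size=(5, 4))               # initial queries
y = np.array([f(x) for x in X])
for t in range(1, 21):
    beta_t = 0.4 * np.log(2 * t)           # assumed schedule
    x_next = add_gp_ucb_step(X, y, beta_t, rng)
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next))
```

Each inner search runs over a $d$-dimensional space, so the per-iteration cost grows with the number of groups rather than exponentially in $D$.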

11/20 Summary of Theoretical Results (for SE Kernel)
GP-UCB with no assumption on $f$: $S_T \in O\big( D^{D/2} (\log T)^{D/2}\, T^{-1/2} \big)$.
GP-UCB on additive $f$: $S_T \in O\big( D\, T^{-1/2} \big)$. Maximising $\varphi_t$: $O(\zeta^{-D})$ effort.
Add-GP-UCB on additive $f$: $S_T \in O\big( D\, T^{-1/2} \big)$. Maximising $\varphi_t$: $O(\mathrm{poly}(D)\, \zeta^{-d})$ effort.

12/20 Add-GP-UCB
Illustration for $f(x_{\{1,2\}}) = f^{(1)}(x_{\{1\}}) + f^{(2)}(x_{\{2\}})$.
[Figure: $f^{(1)}(x_{\{1\}})$ and $f^{(2)}(x_{\{2\}})$ plotted along their own axes; the acquisition functions $\varphi^{(1)}(x_{\{1\}})$ and $\varphi^{(2)}(x_{\{2\}})$ are maximised separately to give $x^{(1)}_t$ and $x^{(2)}_t$, which combine into $x_t = (0.869, 0.141)$.]

13/20 Additive modeling in non-additive settings
Additive models are common in high dimensional regression. E.g.: Backfitting, MARS, COSSO, RODEO, SpAM etc.
$f(x_{\{1,\dots,D\}}) = f^{(1)}(x_{\{1\}}) + f^{(2)}(x_{\{2\}}) + \dots + f^{(D)}(x_{\{D\}})$.
Additive models are statistically simpler: worse bias, but much better variance in the low sample regime.
In BO applications queries are expensive, so we usually cannot afford many queries.
Observation: Add-GP-UCB does well even when $f$ is not additive.
Better bias/variance trade-off in high dimensional regression.
Easy to maximise acquisition function.

14/20 Unknown Kernel/Decomposition in practice
Learn the kernel hyper-parameters and the decomposition $\{\mathcal{X}^{(j)}\}$ by maximising the GP marginal likelihood periodically.
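
One simple way to act on this is to score a handful of candidate decompositions by the GP log marginal likelihood and keep the best one. The sketch below assumes random equal-sized partitions as candidates and keeps the kernel hyper-parameters fixed for brevity; the slide does not specify how the maximisation is carried out, and all function names here are hypothetical.

```python
import numpy as np

def se_kernel(a, b, bandwidth=0.3):
    """Squared exponential kernel between points a (n, d) and b (m, d)."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def log_marginal_likelihood(X, y, groups, noise=1e-2):
    """log p(y | X) under a zero-mean GP with an additive SE kernel over `groups`."""
    K = sum(se_kernel(X[:, g], X[:, g]) for g in groups) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.log(np.diag(L)).sum()          # equals 0.5 * log det K
            - 0.5 * len(y) * np.log(2.0 * np.pi))

def random_decomposition(D, d, rng):
    """Randomly partition coordinates 0..D-1 into groups of size d."""
    perm = rng.permutation(D)
    return [sorted(perm[i:i + d].tolist()) for i in range(0, D, d)]

def select_decomposition(X, y, d, rng, n_trials=20):
    """Keep the candidate decomposition with the highest marginal likelihood."""
    candidates = [random_decomposition(X.shape[1], d, rng) for _ in range(n_trials)]
    return max(candidates, key=lambda g: log_marginal_likelihood(X, y, g))

rng = np.random.default_rng(0)
X = rng.uniform(size=(40, 6))
y = np.sin(3 * X[:, 0] * X[:, 1]) + X[:, 2] * X[:, 3] + np.cos(X[:, 4] + X[:, 5])
best = select_decomposition(X, y, d=2, rng=rng)
best_score = log_marginal_likelihood(X, y, best)
```

In a full loop this selection would be repeated every few iterations as new observations arrive, alongside re-fitting the bandwidth and scale.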

15/20 Experiments
Add-⋆: knows the decomposition. Add-d/M: M groups of size d.
Use 1000 DiRect evaluations to maximise the acquisition function.
DiRect: Dividing Rectangles (Jones et al. 1993).
[Figure: results plot.]

15/20 Experiments
Add-⋆: knows the decomposition. Add-d/M: M groups of size d.
Use 4000 DiRect evaluations to maximise the acquisition function.
[Figure: results plot.]

16/20 SDSS Luminous Red Galaxies
[Figure: parameters (e.g. Hubble constant, baryonic density), cosmological simulator, observation.]
Task: find maximum likelihood cosmological parameters.
20 dimensions, but only 9 parameters are relevant.
Each query takes 2-5 seconds.
Use 500 DiRect evaluations to maximise the acquisition function.

17/20 SDSS Luminous Red Galaxies
[Figure: results plot.] REMBO: (Wang et al. 2013)

18/20 Viola & Jones Face Detection
A cascade of 22 weak classifiers. An image is classified negative if the score < threshold at any stage.
Task: find optimal threshold values on a training set of 1000 images.
22 dimensions.
Each query takes seconds.
Use 1000 DiRect evaluations to maximise the acquisition function.

19/20 Viola & Jones Face Detection
[Figure: results plot.]
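
The black-box function in this experiment maps a vector of stage thresholds to a score on the training set. A minimal sketch of such an objective follows, assuming each image comes with one precomputed score per stage and that training accuracy is the quantity being maximised; the slides state neither, and the data below are made up.

```python
def cascade_predict(stage_scores, thresholds):
    """Positive only if every stage score reaches its threshold; negative otherwise."""
    return all(s >= t for s, t in zip(stage_scores, thresholds))

def accuracy(thresholds, all_scores, labels):
    """Fraction of images whose cascade prediction matches the label."""
    preds = [cascade_predict(scores, thresholds) for scores in all_scores]
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)

# Hypothetical data: three stages, four images (the talk uses 22 stages, 1000 images).
all_scores = [
    [0.9, 0.8, 0.7],   # face
    [0.6, 0.9, 0.8],   # face
    [0.2, 0.9, 0.9],   # not a face: fails stage 1
    [0.7, 0.3, 0.9],   # not a face: fails stage 2
]
labels = [True, True, False, False]

loose = accuracy([0.0, 0.0, 0.0], all_scores, labels)   # accepts everything
tuned = accuracy([0.5, 0.5, 0.5], all_scores, labels)
```

With one threshold per stage, the search space has as many dimensions as the cascade has stages, which gives the 22-dimensional problem on the slide.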

20/20 Summary
Additive assumption improves regret: exponential in $D$ → linear in $D$.
Acquisition function is easy to maximise.
Even when $f$ is not additive, Add-GP-UCB does well in practice.
Similar results hold for Matérn kernels and in the bandit setting.
Some open questions: How to choose $(d, M)$? Can we generalise to other acquisition functions?
Code available: github.com/kirthevasank/add-gp-bandits
Jeff's Talk: Friday, Van Gogh
Thank You.


More information

The Bernoulli distribution

The Bernoulli distribution This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Extended Libor Models and Their Calibration

Extended Libor Models and Their Calibration Extended Libor Models and Their Calibration Denis Belomestny Weierstraß Institute Berlin Vienna, 16 November 2007 Denis Belomestny (WIAS) Extended Libor Models and Their Calibration Vienna, 16 November

More information

Monte-Carlo Planning: Introduction and Bandit Basics. Alan Fern

Monte-Carlo Planning: Introduction and Bandit Basics. Alan Fern Monte-Carlo Planning: Introduction and Bandit Basics Alan Fern 1 Large Worlds We have considered basic model-based planning algorithms Model-based planning: assumes MDP model is available Methods we learned

More information

Financial Times Series. Lecture 6

Financial Times Series. Lecture 6 Financial Times Series Lecture 6 Extensions of the GARCH There are numerous extensions of the GARCH Among the more well known are EGARCH (Nelson 1991) and GJR (Glosten et al 1993) Both models allow for

More information

Modeling the extremes of temperature time series. Debbie J. Dupuis Department of Decision Sciences HEC Montréal

Modeling the extremes of temperature time series. Debbie J. Dupuis Department of Decision Sciences HEC Montréal Modeling the extremes of temperature time series Debbie J. Dupuis Department of Decision Sciences HEC Montréal Outline Fig. 1: S&P 500. Daily negative returns (losses), Realized Variance (RV) and Jump

More information

Probability & Statistics

Probability & Statistics Probability & Statistics BITS Pilani K K Birla Goa Campus Dr. Jajati Keshari Sahoo Department of Mathematics Statistics Descriptive statistics Inferential statistics /38 Inferential Statistics 1. Involves:

More information

Optimal Portfolio Choice under Decision-Based Model Combinations

Optimal Portfolio Choice under Decision-Based Model Combinations Optimal Portfolio Choice under Decision-Based Model Combinations Davide Pettenuzzo Brandeis University Francesco Ravazzolo Norges Bank BI Norwegian Business School November 13, 2014 Pettenuzzo Ravazzolo

More information

Monitoring Accrual and Events in a Time-to-Event Endpoint Trial. BASS November 2, 2015 Jeff Palmer

Monitoring Accrual and Events in a Time-to-Event Endpoint Trial. BASS November 2, 2015 Jeff Palmer Monitoring Accrual and Events in a Time-to-Event Endpoint Trial BASS November 2, 2015 Jeff Palmer Introduction A number of things can go wrong in a survival study, especially if you have a fixed end of

More information

Subject CS1 Actuarial Statistics 1 Core Principles. Syllabus. for the 2019 exams. 1 June 2018

Subject CS1 Actuarial Statistics 1 Core Principles. Syllabus. for the 2019 exams. 1 June 2018 ` Subject CS1 Actuarial Statistics 1 Core Principles Syllabus for the 2019 exams 1 June 2018 Copyright in this Core Reading is the property of the Institute and Faculty of Actuaries who are the sole distributors.

More information

"Pricing Exotic Options using Strong Convergence Properties

Pricing Exotic Options using Strong Convergence Properties Fourth Oxford / Princeton Workshop on Financial Mathematics "Pricing Exotic Options using Strong Convergence Properties Klaus E. Schmitz Abe schmitz@maths.ox.ac.uk www.maths.ox.ac.uk/~schmitz Prof. Mike

More information

Chapter 7: Estimation Sections

Chapter 7: Estimation Sections Chapter 7: Estimation Sections 7.1 Statistical Inference Bayesian Methods: 7.2 Prior and Posterior Distributions 7.3 Conjugate Prior Distributions Frequentist Methods: 7.5 Maximum Likelihood Estimators

More information

On modelling of electricity spot price

On modelling of electricity spot price , Rüdiger Kiesel and Fred Espen Benth Institute of Energy Trading and Financial Services University of Duisburg-Essen Centre of Mathematics for Applications, University of Oslo 25. August 2010 Introduction

More information

arxiv: v1 [math.st] 18 Sep 2018

arxiv: v1 [math.st] 18 Sep 2018 Gram Charlier and Edgeworth expansion for sample variance arxiv:809.06668v [math.st] 8 Sep 08 Eric Benhamou,* A.I. SQUARE CONNECT, 35 Boulevard d Inkermann 900 Neuilly sur Seine, France and LAMSADE, Universit

More information

Market Risk Analysis Volume II. Practical Financial Econometrics

Market Risk Analysis Volume II. Practical Financial Econometrics Market Risk Analysis Volume II Practical Financial Econometrics Carol Alexander John Wiley & Sons, Ltd List of Figures List of Tables List of Examples Foreword Preface to Volume II xiii xvii xx xxii xxvi

More information

Modelling Returns: the CER and the CAPM

Modelling Returns: the CER and the CAPM Modelling Returns: the CER and the CAPM Carlo Favero Favero () Modelling Returns: the CER and the CAPM 1 / 20 Econometric Modelling of Financial Returns Financial data are mostly observational data: they

More information

UPDATED IAA EDUCATION SYLLABUS

UPDATED IAA EDUCATION SYLLABUS II. UPDATED IAA EDUCATION SYLLABUS A. Supporting Learning Areas 1. STATISTICS Aim: To enable students to apply core statistical techniques to actuarial applications in insurance, pensions and emerging

More information

GPD-POT and GEV block maxima

GPD-POT and GEV block maxima Chapter 3 GPD-POT and GEV block maxima This chapter is devoted to the relation between POT models and Block Maxima (BM). We only consider the classical frameworks where POT excesses are assumed to be GPD,

More information

A New Hybrid Estimation Method for the Generalized Pareto Distribution

A New Hybrid Estimation Method for the Generalized Pareto Distribution A New Hybrid Estimation Method for the Generalized Pareto Distribution Chunlin Wang Department of Mathematics and Statistics University of Calgary May 18, 2011 A New Hybrid Estimation Method for the GPD

More information

Practice Exercises for Midterm Exam ST Statistical Theory - II The ACTUAL exam will consists of less number of problems.

Practice Exercises for Midterm Exam ST Statistical Theory - II The ACTUAL exam will consists of less number of problems. Practice Exercises for Midterm Exam ST 522 - Statistical Theory - II The ACTUAL exam will consists of less number of problems. 1. Suppose X i F ( ) for i = 1,..., n, where F ( ) is a strictly increasing

More information

Equity correlations implied by index options: estimation and model uncertainty analysis

Equity correlations implied by index options: estimation and model uncertainty analysis 1/18 : estimation and model analysis, EDHEC Business School (joint work with Rama COT) Modeling and managing financial risks Paris, 10 13 January 2011 2/18 Outline 1 2 of multi-asset models Solution to

More information

Estimation of dynamic term structure models

Estimation of dynamic term structure models Estimation of dynamic term structure models Greg Duffee Haas School of Business, UC-Berkeley Joint with Richard Stanton, Haas School Presentation at IMA Workshop, May 2004 (full paper at http://faculty.haas.berkeley.edu/duffee)

More information

Research Memo: Adding Nonfarm Employment to the Mixed-Frequency VAR Model

Research Memo: Adding Nonfarm Employment to the Mixed-Frequency VAR Model Research Memo: Adding Nonfarm Employment to the Mixed-Frequency VAR Model Kenneth Beauchemin Federal Reserve Bank of Minneapolis January 2015 Abstract This memo describes a revision to the mixed-frequency

More information

Point Estimation. Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage

Point Estimation. Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage 6 Point Estimation Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage Point Estimation Statistical inference: directed toward conclusions about one or more parameters. We will use the generic

More information

Bivariate Birnbaum-Saunders Distribution

Bivariate Birnbaum-Saunders Distribution Department of Mathematics & Statistics Indian Institute of Technology Kanpur January 2nd. 2013 Outline 1 Collaborators 2 3 Birnbaum-Saunders Distribution: Introduction & Properties 4 5 Outline 1 Collaborators

More information

Online Network Revenue Management using Thompson Sampling

Online Network Revenue Management using Thompson Sampling Online Network Revenue Management using Thompson Sampling Kris Johnson Ferreira David Simchi-Levi He Wang Working Paper 16-031 Online Network Revenue Management using Thompson Sampling Kris Johnson Ferreira

More information

Definition 9.1 A point estimate is any function T (X 1,..., X n ) of a random sample. We often write an estimator of the parameter θ as ˆθ.

Definition 9.1 A point estimate is any function T (X 1,..., X n ) of a random sample. We often write an estimator of the parameter θ as ˆθ. 9 Point estimation 9.1 Rationale behind point estimation When sampling from a population described by a pdf f(x θ) or probability function P [X = x θ] knowledge of θ gives knowledge of the entire population.

More information

Anumericalalgorithm for general HJB equations : a jump-constrained BSDE approach

Anumericalalgorithm for general HJB equations : a jump-constrained BSDE approach Anumericalalgorithm for general HJB equations : a jump-constrained BSDE approach Nicolas Langrené Univ. Paris Diderot - Sorbonne Paris Cité, LPMA, FiME Joint work with Idris Kharroubi (Paris Dauphine),

More information

STATISTICS and PROBABILITY

STATISTICS and PROBABILITY Introduction to Statistics Atatürk University STATISTICS and PROBABILITY LECTURE: SAMPLING DISTRIBUTIONS and POINT ESTIMATIONS Prof. Dr. İrfan KAYMAZ Atatürk University Engineering Faculty Department of

More information

The University of Chicago, Booth School of Business Business 41202, Spring Quarter 2009, Mr. Ruey S. Tsay. Solutions to Final Exam

The University of Chicago, Booth School of Business Business 41202, Spring Quarter 2009, Mr. Ruey S. Tsay. Solutions to Final Exam The University of Chicago, Booth School of Business Business 41202, Spring Quarter 2009, Mr. Ruey S. Tsay Solutions to Final Exam Problem A: (42 pts) Answer briefly the following questions. 1. Questions

More information

Linear-Rational Term-Structure Models

Linear-Rational Term-Structure Models Linear-Rational Term-Structure Models Anders Trolle (joint with Damir Filipović and Martin Larsson) Ecole Polytechnique Fédérale de Lausanne Swiss Finance Institute AMaMeF and Swissquote Conference, September

More information

Hierarchical Generalized Linear Models. Measurement Incorporated Hierarchical Linear Models Workshop

Hierarchical Generalized Linear Models. Measurement Incorporated Hierarchical Linear Models Workshop Hierarchical Generalized Linear Models Measurement Incorporated Hierarchical Linear Models Workshop Hierarchical Generalized Linear Models So now we are moving on to the more advanced type topics. To begin

More information

Accelerated Stochastic Gradient Descent Praneeth Netrapalli MSR India

Accelerated Stochastic Gradient Descent Praneeth Netrapalli MSR India Accelerated Stochastic Gradient Descent Praneeth Netrapalli MSR India Presented at OSL workshop, Les Houches, France. Joint work with Prateek Jain, Sham M. Kakade, Rahul Kidambi and Aaron Sidford Linear

More information

Internet Appendix for Asymmetry in Stock Comovements: An Entropy Approach

Internet Appendix for Asymmetry in Stock Comovements: An Entropy Approach Internet Appendix for Asymmetry in Stock Comovements: An Entropy Approach Lei Jiang Tsinghua University Ke Wu Renmin University of China Guofu Zhou Washington University in St. Louis August 2017 Jiang,

More information

Structural GARCH: The Volatility-Leverage Connection

Structural GARCH: The Volatility-Leverage Connection Structural GARCH: The Volatility-Leverage Connection Robert Engle 1 Emil Siriwardane 1 1 NYU Stern School of Business University of Chicago: 11/25/2013 Leverage and Equity Volatility I Crisis highlighted

More information