Multi-armed bandits in dynamic pricing
Arnoud den Boer
University of Twente, Centrum Wiskunde & Informatica Amsterdam
Lancaster, January 11, 2016
Dynamic pricing

A firm sells a product, with abundant inventory, during T ∈ N discrete time periods. Each period t = 1, ..., T:

(i) choose a selling price p_t;
(ii) observe demand d_t = θ_1 + θ_2 p_t + ɛ_t, where θ = (θ_1, θ_2) are unknown parameters in a known set Θ, and ɛ_t is an unobservable random disturbance term;
(iii) collect revenue p_t d_t.

Which non-anticipating prices p_1, ..., p_T maximize the worst-case cumulative expected revenue

    min_{θ ∈ Θ} E[ Σ_{t=1}^{T} p_t d_t ] ?

This is an intractable problem.
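This demand environment is easy to simulate. A minimal sketch, assuming illustrative parameter values θ = (10, −1.5) and Gaussian noise (both assumptions, not from the slides):

```python
import random

def simulate_revenue(prices, theta1=10.0, theta2=-1.5, sigma=0.5, seed=0):
    """Simulate the linear demand model d_t = theta1 + theta2*p_t + eps_t
    and return the realized cumulative revenue sum_t p_t * d_t."""
    rng = random.Random(seed)
    revenue = 0.0
    for p in prices:
        d = theta1 + theta2 * p + rng.gauss(0.0, sigma)  # noisy demand
        revenue += p * d
    return revenue

# With theta = (10, -1.5), the expected-revenue-maximizing price is
# p* = -theta1 / (2 * theta2) = 10/3.
```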
Myopic pricing

An intuitive solution: choose arbitrary initial prices p_1 ≠ p_2. For each t ≥ 2:

(i) determine the least-squares estimate θ̂_t of θ, based on the available sales data;
(ii) set p_{t+1} = arg max_p (θ̂_{t1} + θ̂_{t2} p) p, the perceived optimal decision.

Always choose the perceived optimal action.
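The two steps above can be sketched in code. This is an illustrative implementation under the assumed linear model, with hand-picked true parameters, noise level, and initial prices (all assumptions):

```python
import random

def myopic_pricing(T, theta=(10.0, -1.5), sigma=0.5, p_init=(2.0, 4.0), seed=1):
    """Myopic (certainty-equivalent) pricing: after each period, refit
    least squares on the history and charge the perceived optimal price."""
    rng = random.Random(seed)
    demand = lambda p: theta[0] + theta[1] * p + rng.gauss(0.0, sigma)
    prices = list(p_init)
    demands = [demand(p) for p in prices]
    for _ in range(2, T):
        # Least-squares estimates (th1, th2) of (theta1, theta2).
        n = len(prices)
        pbar, dbar = sum(prices) / n, sum(demands) / n
        sxx = sum((p - pbar) ** 2 for p in prices)
        sxy = sum((p - pbar) * (d - dbar) for p, d in zip(prices, demands))
        th2 = sxy / sxx
        th1 = dbar - th2 * pbar
        # Perceived optimal price: argmax_p (th1 + th2*p)*p = -th1/(2*th2).
        p_next = -th1 / (2 * th2) if th2 < 0 else max(prices)
        prices.append(p_next)
        demands.append(demand(p_next))
    return prices
```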
Convergence

Does θ̂_t converge to θ as t → ∞? No. It seems that θ̂_t always converges, but with probability zero to the true θ. (Open problem.)

This is caused by the prevalence of indeterminate equilibria: parameter estimates such that the true expected demand at the myopic optimal price equals the predicted expected demand.
Indeterminate equilibria

If θ̂ is sufficiently close to θ, then arg max_p (θ̂_1 + θ̂_2 p) p = −θ̂_1 / (2 θ̂_2). Then:

    True expected demand:      θ_1 + θ_2 · (−θ̂_1 / (2 θ̂_2)).   (1)
    Predicted expected demand: θ̂_1 + θ̂_2 · (−θ̂_1 / (2 θ̂_2)).  (2)

If (1) equals (2), then θ̂ is an indeterminate equilibrium: the model output confirms the correctness of the (incorrect) estimates.
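The fixed-point condition (1) = (2) is straightforward to check numerically. In this hypothetical example (values chosen by hand, not from the slides), the wrong estimate θ̂ = (8, −1) is self-confirming under the true θ = (10, −1.5):

```python
def is_indeterminate_equilibrium(theta, theta_hat, tol=1e-9):
    """Check whether theta_hat is an indeterminate equilibrium: true and
    predicted expected demand agree at the myopic optimal price."""
    p_hat = -theta_hat[0] / (2 * theta_hat[1])               # myopic price
    true_demand = theta[0] + theta[1] * p_hat                # equation (1)
    predicted_demand = theta_hat[0] + theta_hat[1] * p_hat   # equation (2)
    return abs(true_demand - predicted_demand) < tol

# theta_hat = (8, -1) is wrong, yet self-confirming: p_hat = 4, and both
# the true and the predicted expected demand at price 4 equal 4.
print(is_indeterminate_equilibrium((10.0, -1.5), (8.0, -1.0)))  # True
```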
Indeterminate equilibria: example

[Figure not transcribed.]
Back to original problem

Which non-anticipating prices p_1, ..., p_T maximize min_{θ ∈ Θ} E[ Σ_{t=1}^{T} p_t d_t ], or, equivalently, minimize the regret

    Regret(T) = max_{θ ∈ Θ} E[ T · max_p (θ_1 + θ_2 p) p − Σ_{t=1}^{T} p_t d_t ].

The exact solution is intractable, and myopic pricing is not optimal. Let's find asymptotically optimal policies: those with the smallest growth rate of Regret(T) in T.
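For a fixed θ (dropping the max over Θ for illustration), the expected regret of a deterministic price sequence can be evaluated directly from the definition above. A sketch, with the same illustrative parameters as before:

```python
def expected_regret(prices, theta=(10.0, -1.5)):
    """Expected regret of a fixed price sequence under known theta:
    T * max_p (th1 + th2*p)*p  minus  sum_t p_t * E[d_t]."""
    th1, th2 = theta
    p_star = -th1 / (2 * th2)                      # optimal price
    opt_per_period = (th1 + th2 * p_star) * p_star
    realized = sum(p * (th1 + th2 * p) for p in prices)
    return len(prices) * opt_per_period - realized

# Charging p* every period gives (essentially) zero regret; any other
# constant price incurs linearly growing regret.
```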
Asymptotically optimal policy

Important observation: variation in controls gives better estimates.

    ‖θ̂_t − θ‖² = O( log t / (t · Var(p_1, ..., p_t)) )  a.s.

(Lai and Wei, Annals of Statistics, 1982.)

To ensure convergence of θ̂_t, some amount of experimentation is necessary. But not too much.
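A toy illustration of why the variance term matters (this is not the Lai and Wei argument, just the degenerate case): with identical prices, Var(p_1, ..., p_t) = 0, the bound above blows up, and indeed intercept and slope cannot be separated.

```python
def price_variance(prices):
    """Sample variance Var(p_1, ..., p_t) appearing in the Lai-Wei bound."""
    n = len(prices)
    mean = sum(prices) / n
    return sum((p - mean) ** 2 for p in prices) / n

print(price_variance([3.0, 3.0, 3.0]))  # 0.0: theta not identifiable
print(price_variance([2.0, 4.0, 3.0]))  # positive: theta identifiable
```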
Controlled Variance pricing

Choose arbitrary initial prices p_1 ≠ p_2. For each t ≥ 2:

(i) determine the least-squares estimate θ̂_t of θ, based on the available sales data;
(ii) set p_{t+1} = arg max_p (θ̂_{t1} + θ̂_{t2} p) p   (perceived optimal decision)
    subject to t · Var(p_1, ..., p_{t+1}) ≥ f(t)   (information constraint),

for some increasing f : N → (0, ∞).

Always choose the perceived optimal action that induces sufficient experimentation.
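One way to enforce the information constraint in code: charge the myopic price when it already keeps t·Var(p_1, ..., p_{t+1}) ≥ f(t), and otherwise perturb it. The fixed +1.0 perturbation and the choice f(t) = √t below are illustrative assumptions, not the paper's exact construction:

```python
import random

def cvp(T, theta=(10.0, -1.5), sigma=0.5, seed=2, f=lambda t: t ** 0.5):
    """Controlled Variance pricing sketch: charge the myopic price unless
    that would violate the constraint t * Var(p_1..p_{t+1}) >= f(t)."""
    rng = random.Random(seed)
    demand = lambda p: theta[0] + theta[1] * p + rng.gauss(0.0, sigma)
    prices = [2.0, 4.0]
    demands = [demand(p) for p in prices]

    def var_with(p_new):
        ps = prices + [p_new]
        m = sum(ps) / len(ps)
        return sum((p - m) ** 2 for p in ps) / len(ps)

    for t in range(2, T):
        n = len(prices)
        pbar, dbar = sum(prices) / n, sum(demands) / n
        sxx = sum((p - pbar) ** 2 for p in prices)
        sxy = sum((p - pbar) * (d - dbar) for p, d in zip(prices, demands))
        th2 = sxy / sxx
        th1 = dbar - th2 * pbar
        p_myopic = -th1 / (2 * th2) if th2 < 0 else max(prices)
        # Deviate when the myopic price would not generate enough
        # price dispersion (the +1.0 perturbation is an assumption).
        p_next = p_myopic if t * var_with(p_myopic) >= f(t) else p_myopic + 1.0
        prices.append(p_next)
        demands.append(demand(p_next))
    return prices
```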
Controlled Variance pricing - performance

    Regret(T) = O( f(T) + Σ_{t=1}^{T} log t / f(t) ).

f balances exploration and exploitation. The optimal f gives Regret(T) = O(√(T log T)). No policy beats √T. Thus, you can characterize the asymptotically (near-)optimal amount of experimentation (the optimal constant is not yet known, in general).
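The balance can be made explicit. A heuristic calculation (not spelled out on the slides) shows that f(t) = √(t log t) equates the order of the two terms in the regret bound:

```latex
\mathrm{Regret}(T) = O\!\left( f(T) + \sum_{t=1}^{T} \frac{\log t}{f(t)} \right),
\qquad f(t) = \sqrt{t \log t} \;\Rightarrow\; f(T) = \sqrt{T \log T},
\]
\[
\sum_{t=2}^{T} \frac{\log t}{\sqrt{t \log t}}
  = \sum_{t=2}^{T} \sqrt{\frac{\log t}{t}}
  \le \sqrt{\log T} \sum_{t=2}^{T} t^{-1/2}
  \le 2\sqrt{T \log T},
```

so both the exploration cost f(T) and the estimation-error cost are O(√(T log T)).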
Extension: multiple products

K products: price vector p_t = (p_t(1), ..., p_t(K)), demand vector d_t = θ (1, p_t)ᵀ + ɛ, with a parameter matrix θ and noise vector ɛ.

Convergence rates of the LS estimator:

    ‖θ̂_t − θ‖² = O( log t / λ_min(t) )  a.s.,

where λ_min(t) is the smallest eigenvalue of the information matrix Σ_{i=1}^{t} (1, p_i)(1, p_i)ᵀ.
Extension: multiple products

Same type of policy:

    p_{t+1} = arg max_p pᵀ θ̂_t (1, p)ᵀ   (perceived optimal decision)
    subject to λ_min(t + 1) ≥ f(t)   (information constraint),

for some increasing f : N → (0, ∞).

Problem: λ_min(t + 1) is a complicated object. It is convertible to a non-convex but tractable quadratic constraint.

    Regret(T) = O( f(T) + Σ_{t=1}^{T} log t / f(t) );  the optimal f gives Regret(T) = O(√(T log T)).
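For a single product (K = 1) the information matrix is 2×2, so λ_min(t) has a closed form; for general K one would call an eigenvalue routine such as numpy.linalg.eigvalsh. A toy computation:

```python
import math

def lambda_min_2x2(prices):
    """Smallest eigenvalue of the 2x2 information matrix
    sum_i (1, p_i)(1, p_i)^T for a single-product price history."""
    a = len(prices)                  # sum of 1 * 1
    b = sum(prices)                  # sum of 1 * p_i
    c = sum(p * p for p in prices)   # sum of p_i^2
    tr, det = a + c, a * c - b * b
    return (tr - math.sqrt(tr * tr - 4 * det)) / 2

print(lambda_min_2x2([3.0, 3.0]))  # 0.0: identical prices -> singular matrix
```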
Many more extensions

- Non-linear demand functions (generalized linear models): E[D(p)] = h(θ_1 + θ_2 p);
- Time-varying markets (how much data to use for inference?);
- Strategic customer behavior (can you detect this from data?);
- Competition (repeated games with incomplete information? Mean-field games with learning?).

See den Boer (2015), Surveys in Operations Research and Management Science 20(1).
Why a parametric demand model?

d_t = θ_1 + θ_2 p_t + ɛ_t ...

It is preferred by price managers. Moreover, by smartly choosing experimentation prices converging to the optimal price, you can hedge against misspecified linear demand.
Can't this log-term be removed?

Regret(T) = O(√(T log T)). The convergence rates of LS estimators are not completely understood: does more data always lead to better estimators?
Pricing airline tickets

Sell C ∈ N perishable products during a (consecutive) selling season of S ∈ N periods. Demand in period t is Bernoulli(h(β_0 + β_1 p_t)), with unknown β_0, β_1. Goal of the firm: maximize total expected revenue.
Full-information solution

If the demand distribution is known, this is a Markov decision problem over remaining inventory c ∈ {0, 1, ..., C} and stage s ∈ {1, ..., S}. It yields optimal prices π_β(c, s) ∈ [p_l, p_h] for each state (c, s).
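The full-information prices π_β(c, s) can be computed by backward induction over the states (c, s). A sketch with a logistic link h and a coarse price grid (the link, parameter values, and grid are all illustrative assumptions):

```python
import math

def optimal_policy(C, S, beta0=5.0, beta1=-1.0, p_grid=None):
    """Backward induction for the full-information MDP: V[c][s] is the
    optimal expected revenue-to-go with c units left and s periods to go."""
    if p_grid is None:
        p_grid = [1.0 + 0.5 * k for k in range(9)]   # prices 1.0, 1.5, ..., 5.0
    h = lambda x: 1.0 / (1.0 + math.exp(-x))          # logistic link (assumption)
    V = [[0.0] * (S + 1) for _ in range(C + 1)]
    policy = {}
    for s in range(1, S + 1):
        for c in range(1, C + 1):
            best, best_p = -1.0, None
            for p in p_grid:
                q = h(beta0 + beta1 * p)              # P(sale at price p)
                val = q * (p + V[c - 1][s - 1]) + (1 - q) * V[c][s - 1]
                if val > best:
                    best, best_p = val, p
            V[c][s] = best
            policy[(c, s)] = best_p
    return V, policy
```

With these toy parameters, scarce inventory and a long remaining season push the optimal price upward, reflecting the marginal value of inventory.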
Pricing airline tickets: incomplete information

Neglecting some technicalities, certainty-equivalent pricing performs well! That is, if the state in period t is (c_t, s_t), use the price π_{β̂_t}(c_t, s_t).
Pricing airline tickets: endogenous learning

The reason for the good performance is the endogenous learning property:

- The optimal price π_β(c, s) depends on the marginal value of inventory.
- This quantity changes throughout the selling season.
- Thus, there is natural price dispersion if π_β is used.
- By continuity arguments: price dispersion if β̂_t is close to β, for all t in the selling season.

Endogenous learning causes fast convergence of the estimates:

    E[ ‖β̂(t) − β^(0)‖² ] = O( log t / t ),

where β^(0) denotes the true parameter.
More informationReinforcement Learning (1): Discrete MDP, Value Iteration, Policy Iteration
Reinforcement Learning (1): Discrete MDP, Value Iteration, Policy Iteration Piyush Rai CS5350/6350: Machine Learning November 29, 2011 Reinforcement Learning Supervised Learning: Uses explicit supervision
More informationNotes on Macroeconomic Theory II
Notes on Macroeconomic Theory II Chao Wei Department of Economics George Washington University Washington, DC 20052 January 2007 1 1 Deterministic Dynamic Programming Below I describe a typical dynamic
More informationPh.D. Preliminary Examination MICROECONOMIC THEORY Applied Economics Graduate Program August 2017
Ph.D. Preliminary Examination MICROECONOMIC THEORY Applied Economics Graduate Program August 2017 The time limit for this exam is four hours. The exam has four sections. Each section includes two questions.
More informationEfficient Market Making via Convex Optimization, and a Connection to Online Learning
Efficient Market Making via Convex Optimization, and a Connection to Online Learning by J. Abernethy, Y. Chen and J.W. Vaughan Presented by J. Duraj and D. Rishi 1 / 16 Outline 1 Motivation 2 Reasonable
More informationECON 6022B Problem Set 2 Suggested Solutions Fall 2011
ECON 60B Problem Set Suggested Solutions Fall 0 September 7, 0 Optimal Consumption with A Linear Utility Function (Optional) Similar to the example in Lecture 3, the household lives for two periods and
More informationModeling the extremes of temperature time series. Debbie J. Dupuis Department of Decision Sciences HEC Montréal
Modeling the extremes of temperature time series Debbie J. Dupuis Department of Decision Sciences HEC Montréal Outline Fig. 1: S&P 500. Daily negative returns (losses), Realized Variance (RV) and Jump
More informationResolution of a Financial Puzzle
Resolution of a Financial Puzzle M.J. Brennan and Y. Xia September, 1998 revised November, 1998 Abstract The apparent inconsistency between the Tobin Separation Theorem and the advice of popular investment
More informationTwo hours. To be supplied by the Examinations Office: Mathematical Formula Tables and Statistical Tables THE UNIVERSITY OF MANCHESTER
Two hours MATH20802 To be supplied by the Examinations Office: Mathematical Formula Tables and Statistical Tables THE UNIVERSITY OF MANCHESTER STATISTICAL METHODS Answer any FOUR of the SIX questions.
More informationExpected utility theory; Expected Utility Theory; risk aversion and utility functions
; Expected Utility Theory; risk aversion and utility functions Prof. Massimo Guidolin Portfolio Management Spring 2016 Outline and objectives Utility functions The expected utility theorem and the axioms
More informationResource Allocation within Firms and Financial Market Dislocation: Evidence from Diversified Conglomerates
Resource Allocation within Firms and Financial Market Dislocation: Evidence from Diversified Conglomerates Gregor Matvos and Amit Seru (RFS, 2014) Corporate Finance - PhD Course 2017 Stefan Greppmair,
More informationIdentification and Estimation of Dynamic Games when Players Belief Are Not in Equilibrium
Identification and Estimation of Dynamic Games when Players Belief Are Not in Equilibrium A Short Review of Aguirregabiria and Magesan (2010) January 25, 2012 1 / 18 Dynamics of the game Two players, {i,
More informationLecture outline W.B.Powell 1
Lecture outline What is a policy? Policy function approximations (PFAs) Cost function approximations (CFAs) alue function approximations (FAs) Lookahead policies Finding good policies Optimizing continuous
More informationA potentially useful approach to model nonlinearities in time series is to assume different behavior (structural break) in different subsamples
1.3 Regime switching models A potentially useful approach to model nonlinearities in time series is to assume different behavior (structural break) in different subsamples (or regimes). If the dates, the
More informationPoint Estimators. STATISTICS Lecture no. 10. Department of Econometrics FEM UO Brno office 69a, tel
STATISTICS Lecture no. 10 Department of Econometrics FEM UO Brno office 69a, tel. 973 442029 email:jiri.neubauer@unob.cz 8. 12. 2009 Introduction Suppose that we manufacture lightbulbs and we want to state
More informationEco504 Spring 2010 C. Sims FINAL EXAM. β t 1 2 φτ2 t subject to (1)
Eco54 Spring 21 C. Sims FINAL EXAM There are three questions that will be equally weighted in grading. Since you may find some questions take longer to answer than others, and partial credit will be given
More informationHeterogeneous Hidden Markov Models
Heterogeneous Hidden Markov Models José G. Dias 1, Jeroen K. Vermunt 2 and Sofia Ramos 3 1 Department of Quantitative methods, ISCTE Higher Institute of Social Sciences and Business Studies, Edifício ISCTE,
More informationSupplemental Online Appendix to Han and Hong, Understanding In-House Transactions in the Real Estate Brokerage Industry
Supplemental Online Appendix to Han and Hong, Understanding In-House Transactions in the Real Estate Brokerage Industry Appendix A: An Agent-Intermediated Search Model Our motivating theoretical framework
More informationJEFF MACKIE-MASON. x is a random variable with prior distrib known to both principal and agent, and the distribution depends on agent effort e
BASE (SYMMETRIC INFORMATION) MODEL FOR CONTRACT THEORY JEFF MACKIE-MASON 1. Preliminaries Principal and agent enter a relationship. Assume: They have access to the same information (including agent effort)
More informationSequential Decision Making
Sequential Decision Making Dynamic programming Christos Dimitrakakis Intelligent Autonomous Systems, IvI, University of Amsterdam, The Netherlands March 18, 2008 Introduction Some examples Dynamic programming
More informationNotes on the EM Algorithm Michael Collins, September 24th 2005
Notes on the EM Algorithm Michael Collins, September 24th 2005 1 Hidden Markov Models A hidden Markov model (N, Σ, Θ) consists of the following elements: N is a positive integer specifying the number of
More informationLecture 14 Consumption under Uncertainty Ricardian Equivalence & Social Security Dynamic General Equilibrium. Noah Williams
Lecture 14 Consumption under Uncertainty Ricardian Equivalence & Social Security Dynamic General Equilibrium Noah Williams University of Wisconsin - Madison Economics 702 Extensions of Permanent Income
More informationDynamic Portfolio Execution Detailed Proofs
Dynamic Portfolio Execution Detailed Proofs Gerry Tsoukalas, Jiang Wang, Kay Giesecke March 16, 2014 1 Proofs Lemma 1 (Temporary Price Impact) A buy order of size x being executed against i s ask-side
More informationCSCI 1951-G Optimization Methods in Finance Part 07: Portfolio Optimization
CSCI 1951-G Optimization Methods in Finance Part 07: Portfolio Optimization March 9 16, 2018 1 / 19 The portfolio optimization problem How to best allocate our money to n risky assets S 1,..., S n with
More information6.207/14.15: Networks Lecture 10: Introduction to Game Theory 2
6.207/14.15: Networks Lecture 10: Introduction to Game Theory 2 Daron Acemoglu and Asu Ozdaglar MIT October 14, 2009 1 Introduction Outline Review Examples of Pure Strategy Nash Equilibria Mixed Strategies
More informationChapter 3. Dynamic discrete games and auctions: an introduction
Chapter 3. Dynamic discrete games and auctions: an introduction Joan Llull Structural Micro. IDEA PhD Program I. Dynamic Discrete Games with Imperfect Information A. Motivating example: firm entry and
More informationAdaptive Experiments for Policy Choice. March 8, 2019
Adaptive Experiments for Policy Choice Maximilian Kasy Anja Sautmann March 8, 2019 Introduction The goal of many experiments is to inform policy choices: 1. Job search assistance for refugees: Treatments:
More informationMicroeconomic Theory May 2013 Applied Economics. Ph.D. PRELIMINARY EXAMINATION MICROECONOMIC THEORY. Applied Economics Graduate Program.
Ph.D. PRELIMINARY EXAMINATION MICROECONOMIC THEORY Applied Economics Graduate Program May 2013 *********************************************** COVER SHEET ***********************************************
More informationOptimally Thresholded Realized Power Variations for Lévy Jump Diffusion Models
Optimally Thresholded Realized Power Variations for Lévy Jump Diffusion Models José E. Figueroa-López 1 1 Department of Statistics Purdue University University of Missouri-Kansas City Department of Mathematics
More information