Lec 1: Single Agent Dynamic Models: Nested Fixed Point Approach. K. Sudhir MGT 756: Empirical Methods in Marketing


2 RUST (1987) MODEL AND ESTIMATION APPROACH

3 A Model of Harold Zurcher: Rust (1987)
An empirical model of engine maintenance/replacement decisions.
Tradeoff:
- Replace: high cost to replace, but low future cost to maintain
- Maintain: no cost of replacement, but higher cost of maintenance
This is an optimal stopping problem. Structure of the solution: replace if miles exceed a threshold, else continue with the old machine.

4 Optimal Stopping Problems: Examples (Search Models)
- Marketing: when to stop searching and buy a new product (e.g., a camera or iPod). Threshold: has the price dropped enough?
- Labor economics: when to stop searching and accept a job. Threshold: have you been offered a high enough salary?

5 The Model

6 Notation

7 Notes

8 Value Function

9 Solving the DO Problem: Bellman Equation
Note the similarity to how you would have estimated the static G&L (Guadagni and Little) model, if only you knew the value function V.
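The Bellman equation itself was an image on the slide and did not survive transcription. In the standard notation of the Rust (1987) model, which the lecture follows, it reads (a reconstruction, not the slide verbatim):

$$ V(x,e) = \max_{i \in \{0,1\}} \Big\{ u(x,i;\theta) + e(i) + \beta \, E\big[ V(x',e') \mid x, i \big] \Big\} $$

where x is the observed state (mileage), e(i) are the unobserved payoff shocks, and beta is the discount factor.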

10 Choice-Specific Value Function
Notice the tilde (~) marking the choice-specific value function. We solve for this value function using a recursive technique called value function iteration.
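In the same reconstructed notation, the choice-specific value function (the object carrying the tilde on the slide) is:

$$ \tilde{V}(x,i) = u(x,i;\theta) + \beta \, E\big[ V(x',e') \mid x, i \big] $$

so the agent chooses the i that maximizes \tilde{V}(x,i) + e(i).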

11 Note: Infinite Horizon vs. Finite Horizon Dynamic Optimization (DO) Problems
This is an infinite horizon problem: we assume stationarity (time homogeneity) and solve the DO using value function iteration.
We also have finite horizon problems: they are non-stationary, and the DO is solved using backward induction from the final period. E.g., the sales-force compensation problem in Chung et al.

12 Going Back to Rust Additive Separability

13 What do we mean by structural errors?

14 Parameters to be Estimated
The discount factor beta is not typically estimated; essentially, there is an identification problem.

15 The Identification Problem (Magnac and Thesmar 2002)

16 What Kinds of Data May Help Identify the Discount Factor?
One needs variables that do not affect current payoffs, but only future payoffs.
- Chevalier and Goolsbee (2005): a student's choice of purchasing a new text depends on whether a new edition is to be released soon.
- Chung, Steenburgh and Sudhir (2009): effort is related to not just current payoffs but also how far one is from a future bonus.

17 ECONOMETRIC MODEL

18 Econometric Model Data: Buses are assumed homogeneous and independent

19 Assumptions
Transition probabilities are Markov.
Conditional independence, in two parts:
- Given x, e is independent over time.
- Conditional on x and i, next period's x is independent of e.
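Formally (a standard statement of Rust's conditional independence assumption, reconstructed rather than copied from the slide), the joint transition density factors as:

$$ p(x_{t+1}, e_{t+1} \mid x_t, e_t, i_t) = q(e_{t+1} \mid x_{t+1}) \; p(x_{t+1} \mid x_t, i_t) $$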

20 Likelihood Function for a Bus
Built from the Markovian assumption and the conditional independence assumption.
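The likelihood on the slide was an image; under the two assumptions above, the likelihood for one bus observed over T periods takes the standard form (a reconstruction):

$$ L(\theta, \theta_3) = \prod_{t=1}^{T} P(i_t \mid x_t; \theta, \theta_3) \; p(x_{t+1} \mid x_t, i_t; \theta_3) $$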

21 Log-Likelihood of the Model
The likelihood separates into two components:
1. Structural parameters: maintenance cost and engine replacement cost
2. Markov transition probabilities
Therefore you can do estimation in two independent steps (this is not the "two-step estimation" you hear of):
1. Estimate the transition probabilities theta_3 (relatively easy: empirical frequencies)
2. Estimate the other structural parameters
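Taking logs makes the separation explicit; writing the cost parameters as theta and the transition parameters as theta_3 (reconstructed notation):

$$ \ell(\theta, \theta_3) = \sum_{t} \log P(i_t \mid x_t; \theta, \theta_3) + \sum_{t} \log p(x_{t+1} \mid x_t, i_t; \theta_3) $$

theta_3 is estimated consistently from the second sum alone (empirical transition frequencies), and theta is then estimated from the first sum with the estimated theta_3 plugged in.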

22 Estimating
Assumption: i.i.d. extreme value distribution for e. This yields a dynamic logit model.
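Under the extreme value assumption the choice probabilities have the familiar logit form, evaluated at the choice-specific value functions (a reconstruction of the slide's equation):

$$ P(i \mid x; \theta) = \frac{\exp\big(\tilde{V}(x,i)\big)}{\sum_{j} \exp\big(\tilde{V}(x,j)\big)} $$

This is exactly a static logit, except that \tilde{V} contains the discounted continuation value rather than just the current-period utility.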

23 General Notation

24 Estimation Method for Second Step: Nested Fixed Point Algorithm

25 Doing the Inner Loop
Trick: iterate over the expected value function, rather than the value function. This avoids having to compute value functions at values of e0 and e1, which are additional state variables.

26 Getting the Expected Value Function
Use the closed form for the expectation of the maximum under the extreme value distribution.
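That closed form is the log-sum-exp ("social surplus") formula; for i.i.d. type-I extreme value shocks with unit scale (a standard result, not copied from the slide):

$$ E\Big[\max_{j}\big(v_j + e_j\big)\Big] = \log \sum_{j} \exp(v_j) + \gamma $$

where gamma ≈ 0.5772 is Euler's constant, a location shift that is usually absorbed or ignored, as in the code later in these slides.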

27 Expected Value Function
- Discrete state space: take a weighted sum over the probabilities of being in each state.
- Continuous state space: discretize the state space and interpolate to estimate EV at other state-space values.

28 Value Function Iteration
Let t index iterations. Iterate on the EV fixed-point equation until the EV on the LHS and RHS are within tolerance.
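The iterated equation was an image on the slide; reconstructed here (and matching the ValFuncIter code below), it combines the log-sum-exp formula with the transition probabilities:

$$ EV^{t+1}(x,i) = \sum_{x'} p(x' \mid x, i) \, \log \sum_{j} \exp\big( u(x',j;\theta) + \beta\, EV^{t}(x',j) \big) $$

Stop when max over (x,i) of |EV^{t+1}(x,i) - EV^{t}(x,i)| falls below the tolerance.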

29 Summary: Assumptions
Key assumptions:
- Additive separability of the error term (AS)
- Markov transition probabilities
- Conditional independence (CI)
- i.i.d. errors (no serial correlation in errors)
- Extreme value distribution (convenient)

30 Summary: Estimation
- Discount factor is assumed.
- Transition probabilities: estimated non-parametrically for continuous states, by empirical frequencies for discrete states.
- Key structural parameters: estimated using the nested fixed point algorithm.
  - Outer loop: estimate theta.
  - Inner loop: solve the DP (value function iteration for infinite horizon, backward induction for finite horizon).

31 PROGRAMMING EXERCISE

32 Programming a Simple Optimal Stopping Problem: Rust Adaptation
Payoff function, where:
- a_t is the age of the machine at time t
- i_t = 1 (replace), 0 (do not replace)
- R is the replacement cost
- theta_1 * a_t is the maintenance cost of a machine of age a_t
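Written out (a reconstruction consistent with the lik function below, where theta_1 and theta_2 are both negative so that costs reduce utility):

$$ u(a_t, i_t; \theta) = \begin{cases} \theta_1\, a_t & \text{if } i_t = 0 \ (\text{maintain}) \\ \theta_2 & \text{if } i_t = 1 \ (\text{replace, } \theta_2 = -R) \end{cases} $$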

33 State Space Evolution
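The slide's content was lost in transcription. The evolution consistent with the five-state setup in the code below would be (an assumption, not the slide verbatim): a kept machine ages by one period up to a maximum age of 5, and a replaced machine resets to age 1:

$$ a_{t+1} = \begin{cases} \min(a_t + 1, 5) & \text{if } i_t = 0 \\ 1 & \text{if } i_t = 1 \end{cases} $$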

34 Loading and Setting up the Data

% Declare globals so lik and ValFuncIter can see these variables
global beta NState NAct F0 F1 N Ch rust

% Load Data
load rust
N = size(rust,1);            % rows() is GAUSS; size(.,1) is the MATLAB equivalent

% Convert data into a choice matrix (column 1 = keep, column 2 = replace)
Ch = zeros(N,2);
Ch(:,1) = (rust(:,2)==0);
Ch(:,2) = (rust(:,2)==1);

% Set up the number of states, actions, and the discount factor
NState = 5;
NAct = 2;
beta = 0.9;

% Starting values
theta = [-2; -3];

35 Setting up the Action-Specific Transition Probabilities

% The matrix entries were lost in transcription; the values below assume the
% deterministic aging rule sketched above (keep: age advances by one, capped
% at 5; replace: age resets to 1). Rows = current state, columns = next state.
F0 = [0 1 0 0 0;
      0 0 1 0 0;
      0 0 0 1 0;
      0 0 0 0 1;
      0 0 0 0 1];
F1 = [1 0 0 0 0;
      1 0 0 0 0;
      1 0 0 0 0;
      1 0 0 0 0;
      1 0 0 0 0];

This is completely deterministic in this problem, so we are not estimating it. If it were unknown, we would have to estimate the probabilities from the data.

36 Maximizing the Likelihood Function (lik)

options = optimset('Display','iter');   % options was undefined in the original; this is a minimal choice
[theta,fval,exitflag,output,grad,hessian] = fminunc(@lik,theta,options);

37 Likelihood Function

function f = lik(theta)
global beta NState NAct F0 F1 N Ch rust

% Static per-period utility at the observed states
U = zeros(N,NAct);
U(:,1) = theta(1)*rust(:,1);   % maintain: cost proportional to age
U(:,2) = theta(2);             % replace

% Expected value function through value function iteration
EV = ValFuncIter(theta);

% The dynamic logit part of the model
V = zeros(N,NAct);
V(:,1) = exp(U(:,1) + beta*EV(rust(:,1),1));
V(:,2) = exp(U(:,2) + beta*EV(rust(:,1),2));
SV = sum(V,2);
Prob = V./SV(:,[1 1]);

% Negative log-likelihood; the 1e-30 (inside exp in the original) guards against log(0)
f = -sum(sum(log(Prob + 1e-30).*Ch));

38 Value Function Iteration

function EV = ValFuncIter(theta)
global beta NState NAct F0 F1

% Setting up
Toler = 0.5;
EV1 = zeros(NState,NAct);
EV0 = zeros(NState,NAct);

% State grid is discrete already, so this is easy; otherwise create a grid
StateGrid = (1:NState)';       % seqa(1,1,5) is GAUSS; this is the MATLAB equivalent

% Fill out the static part of the utility at all combinations of states and actions
U = zeros(NState,NAct);
U(:,1) = theta(1)*StateGrid;
U(:,2) = theta(2);

% Do the value function iterations
while abs(Toler) > 0.0005
    EV1(:,1) = F0*log(sum(exp(U + beta*EV0),2));
    EV1(:,2) = F1*log(sum(exp(U + beta*EV0),2));
    Toler = max(max(abs(EV1-EV0)));
    EV0 = EV1;
end

% Return the expected value function at all states and actions
EV = EV1;

39 HOMEWORK EXERCISE

40 Homework (to be done by next Friday)
See the posted assignment of Holger Sieg, Part A.
- Questions 1 and 2: these can easily be done by following the lecture.
- Questions 3 and 4: you have to do value function iteration to solve them, which is essentially in the code I gave you.
- Question 5: I have given you the answer with the program.
- Question 6: answer and program Question 6a; simply answer Questions 6b and 6c.
