Family Vacation

II-0 Family Vacation

Set of cities denoted by P_1, P_2, ..., P_n.
d_i: distance from P_{i-1} to P_i (1 < i <= n); d_1 = 0.
c_i: cost of dinner, lodging, and breakfast when staying at city P_i (1 < i < n); c_1 = c_n = 0.
w: maximum number of miles the family may drive each day. Assume that d_i <= w for each i.

Problem: Find the cities where the family must stay overnight when driving from city P_1 to P_n (using the route P_1, P_2, ..., P_n) so that the family drives at most w miles each day and the total cost of the overnight stays is as small as possible.

II-1 Greedy Method 1

Drive as much as possible each day.

This method does not always generate an optimal solution. Counterexample (w = 300):

Cities  P_1  P_2  P_3  P_4  P_5  P_6  P_7
d_i     ...
c_i     ...

Greedy solution: stay at P_3 and P_5. Total cost is c_3 + c_5, which is greater than 60.
Optimal solution: stay at P_2, P_4, and P_6. Total cost is 60.

Therefore this greedy method does not generate an optimal solution. However, it does minimize the number of overnight stays.
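For concreteness, here is a minimal Python sketch of Greedy Method 1. The table values above did not survive transcription, so the data below is a hypothetical instance with the same shape: with w = 300, the greedy route stays at P_3 and P_5 (cost 100) while staying at P_2, P_4, and P_6 costs only 60.

```python
def greedy_farthest(d, c, w):
    """Greedy Method 1: each day, drive as far as possible within w miles.

    Cities are 1..n; d[i] is the distance from P_{i-1} to P_i (d[1] = 0),
    c[i] is the overnight cost at P_i.  Returns the list of overnight stops.
    Assumes d[i] <= w for all i, as in the problem statement.
    """
    n = len(d) - 1
    stops, i = [], 1
    while i < n:
        j, dist = i, 0
        # advance to the farthest city reachable today
        while j < n and dist + d[j + 1] <= w:
            dist += d[j + 1]
            j += 1
        if j < n:            # not home yet: stay overnight at P_j
            stops.append(j)
        i = j
    return stops

# hypothetical data (index 0 unused)
d = [0, 0, 150, 150, 150, 150, 150, 150]
c = [0, 0, 20, 50, 20, 50, 20, 0]
print(greedy_farthest(d, c, 300))   # [3, 5]: cost 100, versus 60 for {2, 4, 6}
```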

II-2 Greedy Method 2

Stay overnight at the least expensive place within w miles of the previous overnight stay. (Ties?) Stay at the city with the least cost that is farthest from the previous stay.

This method does not always generate an optimal solution. Counterexample (w = 50):

Cities  P_1  P_2  P_3  P_4  P_5  P_6  P_7  P_8
d_i     0    10   10   10   10   10   ...  ...
c_i     0    50   51   52   53   54   55   0

Greedy solution: stay at P_2, P_3, P_4, P_5, P_6, and P_7. Total cost is 50 + 51 + 52 + 53 + 54 + 55 = 315.
Better solution: stay at P_6 and P_7. Cost = 54 + 55 = 109.

This greedy method is not optimal.

II-3 Greedy Method 3

Stay overnight at the city (at a distance of at most w miles from the previous stay) where the cost of the overnight stay divided by the miles driven is as small as possible. (Ties?) Stay at the city with the least cost that is farthest from the previous stay.

This method does not always generate an optimal solution. Counterexample (w = 50), using the same table as before:

Cities  P_1  P_2  P_3  P_4  P_5  P_6  P_7  P_8
d_i     0    10   10   10   10   10   ...  ...
c_i     0    50   51   52   53   54   55   0

$/dist from P_1: 50/10 = 5, 51/20 = 2.55, 52/30 = 1.73, 53/40 = 1.32, 54/50 = 1.08.

Greedy solution: stay at P_6 and P_7. Cost = 54 + 55 = 109.
Better solution: stay at P_2 and P_7. Cost = 50 + 55 = 105.

II-4

This greedy method is not optimal.
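Greedy Methods 2 and 3 differ only in the selection rule applied to the cities reachable from the current stop, so one sketch covers both. A minimal Python version under the slides' tie-breaking (least cost, then farthest) follows; d_7 and d_8 were lost in transcription, so the values 10 and 50 below are assumed placeholders consistent with w = 50.

```python
def greedy_by_rule(d, c, w, key):
    """One-day-at-a-time greedy: from the current stop, stay at the reachable
    city minimizing key(city, miles_driven); ties are broken by least cost,
    then by farthest.  Assumes d[i] <= w for all i."""
    n = len(d) - 1
    stops, i = [], 1
    while True:
        # enumerate (city, miles driven today) pairs reachable from P_i
        reachable, dist, j = [], 0, i
        while j < n and dist + d[j + 1] <= w:
            dist += d[j + 1]
            j += 1
            reachable.append((j, dist))
        if reachable[-1][0] == n:   # P_n is within reach: done, drive home
            return stops
        j, _ = min(reachable, key=lambda t: (key(t[0], t[1]), c[t[0]], -t[1]))
        stops.append(j)
        i = j

# the shared II-2/II-3 table (w = 50); d[7] and d[8] are assumed values
d = [0, 0, 10, 10, 10, 10, 10, 10, 50]
c = [0, 0, 50, 51, 52, 53, 54, 55, 0]
print(greedy_by_rule(d, c, 50, lambda j, dist: c[j]))         # Method 2: [2, 3, 4, 5, 6, 7]
print(greedy_by_rule(d, c, 50, lambda j, dist: c[j] / dist))  # Method 3: [6, 7]
```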

II-5 Dynamic Programming Approach

M_i: minimum cost of the overnight stays when driving from P_i to P_n, with the restriction that the family drives at most w miles each day. This assumes that the family starts at P_i.

The objective is to compute M_1 and then figure out the cities where the family stays overnight in an optimal solution with cost M_1.

Clearly, M_n = 0.

II-6 Recurrence Relation

M_i = min { c_{i+1} + M_{i+1}   if d_{i+1} <= w,
            c_{i+2} + M_{i+2}   if d_{i+1} + d_{i+2} <= w,
            ...,
            c_j + M_j           if d_{i+1} + d_{i+2} + ... + d_j <= w }

where either j = n or d_{i+1} + ... + d_{j+1} > w.

Note that the above equation means that on the next day one would drive to, and stay at, P_{i+1}, or P_{i+2}, ..., or P_j.

II-7 Example

Cities  P_1  P_2  P_3  P_4  P_5  P_6  P_7  P_8
d_i     0    150  100  100  70   100  150  90
c_i     0    40   100  60   ...  30   ...  0

w is equal to 300.

By definition, M_8 = 0.

M_7 = min { c_8 + M_8  (90 <= w = 300) }
M_7 = 0

M_6 = min { c_7 + M_7  (150 <= w = 300),
            c_8 + M_8  (150 + 90 = 240 <= w = 300) }

II-8

M_6 = 0

II-9

M_5 = min { c_6 + M_6  (100 <= w = 300),
            c_7 + M_7  (100 + 150 = 250 <= w = 300) }
M_5 = 30

M_4 = min { c_5 + M_5  (70 <= w = 300),
            c_6 + M_6  (70 + 100 = 170 <= w = 300) }
M_4 = 30

II-10

M_3 = min { c_4 + M_4  (100 <= w = 300),
            c_5 + M_5  (100 + 70 = 170 <= w = 300),
            c_6 + M_6  (100 + 70 + 100 = 270 <= w = 300) }
M_3 = 30

M_2 = min { c_3 + M_3  (100 <= w = 300),
            c_4 + M_4  (100 + 100 = 200 <= w = 300),
            c_5 + M_5  (100 + 100 + 70 = 270 <= w = 300) }
M_2 = 90

II-11

M_1 = min { c_2 + M_2  (150 <= w = 300),
            c_3 + M_3  (150 + 100 = 250 <= w = 300) }
M_1 = 130

Optimal solutions. Stay overnight at:
P_1, P_2, P_4, P_6, P_8: cost is 40 + 60 + 30 = 130.
P_1, P_3, P_6, P_8: cost is 100 + 30 = 130.
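As a check on the table above, here is the iterative DP of slide II-14 below, run on this example in Python. The values of c_5 and c_7 were lost in transcription, so 80 and 50 are assumed stand-ins (any c_5 >= 60 and c_7 >= 30 reproduce the same M values).

```python
from math import inf

def solve(d, c, w):
    """Compute M[i] = min over reachable j > i of c[j] + M[j], bottom-up."""
    n = len(d) - 1
    M = [inf] * (n + 1)
    M[n] = 0
    for i in range(n - 1, 0, -1):
        dist = 0
        for j in range(i + 1, n + 1):
            dist += d[j]
            if dist > w:
                break
            M[i] = min(M[i], M[j] + c[j])
    return M

# table from slide II-7; c[5] = 80 and c[7] = 50 are assumed values
d = [0, 0, 150, 100, 100, 70, 100, 150, 90]
c = [0, 0, 40, 100, 60, 80, 30, 50, 0]
print(solve(d, c, 300)[1:])   # [130, 90, 30, 30, 30, 0, 0, 0]
```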

II-12 Exponential Time Algorithm

Main program invokes M(1). Note that c_j is written as c[j] and d_j as d[j] in the following procedures.

int Procedure M(i)   // exponential-time algorithm
    if i == n then return 0
    cost = +infinity
    dist = 0
    for j = i+1 to n do
        dist = dist + d[j]
        if dist <= w then
            cost = min{ cost, M(j) + c[j] }
        else
            exit j-loop
        endif
    endfor
    return cost
end procedure
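A direct Python rendering of this procedure (same 1-indexed arrays as the slides) might look as follows; it recomputes subproblems and therefore takes exponential time.

```python
def M_exponential(i, d, c, w, n):
    """Slide II-12: plain recursion on the recurrence, no memoization."""
    if i == n:
        return 0
    cost = float("inf")
    dist = 0
    for j in range(i + 1, n + 1):
        dist += d[j]
        if dist > w:
            break                       # P_j and beyond are out of range
        cost = min(cost, M_exponential(j, d, c, w, n) + c[j])
    return cost
```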

II-13 Polynomial Time Algorithm

Main program invokes M(1). Note that c_j is written as c[j] and d_j as d[j] in the following procedures. Avoid recomputation of M values.

define M[i] = +infinity for 1 <= i < n
define M[n] = 0

int Procedure M(i)   // polynomial-time algorithm
    if M[i] is finite then return M[i]
    dist = 0
    for j = i+1 to n do
        dist = dist + d[j]
        if dist <= w then
            M[i] = min{ M[i], M(j) + c[j] }
        else
            exit j-loop
        endif
    endfor
    return M[i]
end procedure
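The same recursion with the slide's memoization, sketched in Python: M is preset to +infinity except M[n] = 0, so each entry is computed once and the total work is O(n^2).

```python
from math import inf

def M_memoized(i, d, c, w, n, M):
    """Slide II-13: return M[i], computing it only if it is still +infinity."""
    if M[i] != inf:                     # already computed (or i == n)
        return M[i]
    dist = 0
    for j in range(i + 1, n + 1):
        dist += d[j]
        if dist > w:
            break
        M[i] = min(M[i], M_memoized(j, d, c, w, n, M) + c[j])
    return M[i]

# usage: M = [inf] * (n + 1); M[n] = 0; best = M_memoized(1, d, c, w, n, M)
```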

II-14 Iterative Algorithm

Main program invokes Iterative-Compute. Compute the M values iteratively.

int Procedure Iterative-Compute
    define M[n] = 0
    for i = n-1 to 1 by -1 do
        M[i] = +infinity
        dist = 0
        for j = i+1 to n do
            dist = dist + d[j]
            if dist <= w then
                M[i] = min{ M[i], M[j] + c[j] }
            else
                exit j-loop
            endif
        endfor
    endfor
end procedure

The time complexity of the inner loop is O(n). Since it is executed n - 1 times, the overall time complexity is O(n^2).

II-15 Printing the Overnight Locations

Main program invokes Iterative-Compute-Print. Note that c_j is written as c[j] and d_j as d[j] in the following procedure. Compute the M values iteratively.

int Procedure Iterative-Compute-Print
    invoke the iterative algorithm; assume we have the M[i] values
    i = 1
    while i <> n do
        for j = i+1 to n do
            if M[i] == M[j] + c[j] then
                print j
                i = j
                exit j-loop
            endif
        endfor
    endwhile
end procedure

The time complexity of the print part is O(n^2).

II-16 Without Recomputation

Main program invokes Iterative-Compute-Print. Compute the M and kay values iteratively. kay[i] stores the next city for the overnight stay in an optimal solution with cost M_i.

int Procedure Iterative-Compute-Print
    define M[n] = 0
    for i = n-1 to 1 by -1 do
        M[i] = +infinity
        dist = 0
        for j = i+1 to n do
            dist = dist + d[j]
            if dist <= w then
                if M[j] + c[j] < M[i] then
                    kay[i] = j
                    M[i] = M[j] + c[j]
                endif
            else
                exit j-loop
            endif
        endfor
    endfor

II-17

    /* Printing part */
    i = 1
    while i <> n do
        print kay[i]
        i = kay[i]
    endwhile
end procedure

The time complexity of the print part is O(n), and the computation of the M[i] and kay[i] values takes O(n^2).
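Putting II-16 and II-17 together, a runnable Python sketch (reusing the slide II-7 data, with the same assumed values for c_5 and c_7 as before):

```python
from math import inf

def iterative_compute_print(d, c, w):
    """Slides II-16/II-17: compute M[] and kay[] bottom-up, then follow the
    kay[] links from city 1 to recover the overnight stops in O(n)."""
    n = len(d) - 1
    M = [inf] * (n + 1)
    kay = [0] * (n + 1)
    M[n] = 0
    for i in range(n - 1, 0, -1):
        dist = 0
        for j in range(i + 1, n + 1):
            dist += d[j]
            if dist > w:
                break
            if M[j] + c[j] < M[i]:
                kay[i] = j              # best next overnight stop from P_i
                M[i] = M[j] + c[j]
    # printing part: walk the kay[] links (the final entry is P_n itself)
    stops, i = [], 1
    while i != n:
        stops.append(kay[i])
        i = kay[i]
    return M[1], stops

d = [0, 0, 150, 100, 100, 70, 100, 150, 90]
c = [0, 0, 40, 100, 60, 80, 30, 50, 0]   # c[5], c[7] assumed as before
print(iterative_compute_print(d, c, 300))   # (130, [2, 4, 6, 8])
```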
