Handout 4: Deterministic Systems and the Shortest Path Problem

SEEM 3470: Dynamic Optimization and Applications                2013-14 Second Term
Handout 4: Deterministic Systems and the Shortest Path Problem
Instructor: Shiqian Ma                                          January 27, 2014

Suggested Reading: Bertsekas' lecture slides on Dynamic Programming; Sections 2.1 and 2.2 of Chapter 2 of Bertsekas, Dynamic Programming and Optimal Control: Volume I (3rd Edition), Athena Scientific, 2005.

1 Introduction

In this handout, we focus on deterministic finite state systems, i.e., systems in which the number of possible states in each time period is finite, and the parameter w_k in each time period k can take on only one value. Note that both the N-stage resource allocation problem and the operations scheduling problem are examples of deterministic systems. However, the former is not a finite state system, while the latter is.

2 Finite State Systems and Shortest Paths

2.1 Formulating a Deterministic Finite State Problem as a Shortest Path Problem

Consider now a deterministic problem in which the number of possible states in each time period k is finite. Then, at any state S_k, a control x_k can be regarded as a transition from S_k to the state Γ_k(S_k, x_k, w_k) at a cost Λ_k(S_k, x_k, w_k). In particular, we can use a graph to represent such a system. Each node corresponds to a possible state of the system, and an arc (a directed edge) corresponds to a transition between states at successive stages. Furthermore, each arc is associated with a cost. To take care of the final stage, an artificial terminal node t is added, and each node corresponding to a final-stage state S_N is connected to t via an arc of cost Λ_N(S_N). See Figure 1 for an illustration.

Figure 1: Transition Diagram of a Deterministic Finite State System

With the above setup, it is not hard to see that a control sequence x_0, x_1, ..., x_{N-1} traces out a path in the transition diagram originating at the initial state s and terminating at a node corresponding to the final stage. Moreover, the cost associated with the control sequence is simply the sum of the costs on the arcs of the path from s to t. Thus, if we view the costs on the arcs as distances, then the problem of finding a control sequence (or policy) that minimizes the cost function is equivalent to finding the shortest path from s to t in the transition diagram.

Formally, let S_k be the set of possible states in time period k. Let a^k_{ij} denote the cost of the transition at stage k from state i ∈ S_k to state j ∈ S_{k+1}, and let a^N_{it} = Λ_N(i) be the terminal cost of state i ∈ S_N. Here, we adopt the convention that if there is no transition from state i ∈ S_k to state j ∈ S_{k+1}, then a^k_{ij} = ∞. Using this notation, the dynamic programming algorithm for the deterministic finite state system takes the following form:

    J_N(i) = a^N_{it}                                       for all i ∈ S_N,
    J_k(i) = min_{j ∈ S_{k+1}} [ a^k_{ij} + J_{k+1}(j) ]    for all i ∈ S_k, k = 0, 1, ..., N-1.    (1)
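To make the backward recursion (1) concrete, the following Python sketch carries it out on a small staged problem. The state sets, transition costs, and terminal costs in the example are hypothetical, chosen only for illustration; they do not come from the handout.

import math

def backward_dp(states, cost, terminal_cost):
    """Backward DP for a deterministic finite-state system, cf. recursion (1).

    states[k]        : list of states S_k, k = 0, ..., N
    cost[k][(i, j)]  : a^k_{ij}, cost of moving from i in S_k to j in S_{k+1}
                       (missing pairs are treated as +infinity)
    terminal_cost[i] : a^N_{it} = Lambda_N(i) for i in S_N
    Returns the cost-to-go tables J[k] and an optimal-successor map.
    """
    N = len(states) - 1
    J = {N: dict(terminal_cost)}
    best_next = {}
    for k in range(N - 1, -1, -1):
        J[k], best_next[k] = {}, {}
        for i in states[k]:
            # J_k(i) = min_{j in S_{k+1}} [ a^k_{ij} + J_{k+1}(j) ]
            J[k][i], best_next[k][i] = min(
                (cost[k].get((i, j), math.inf) + J[k + 1][j], j)
                for j in states[k + 1])
    return J, best_next

# Hypothetical example with N = 2: one initial state, two states per later stage.
states = [['s'], ['A', 'B'], ['C', 'D']]
cost = [{('s', 'A'): 1, ('s', 'B'): 4},
        {('A', 'C'): 2, ('A', 'D'): 6, ('B', 'C'): 1, ('B', 'D'): 1}]
terminal = {'C': 5, 'D': 2}
J, nxt = backward_dp(states, cost, terminal)
print(J[0]['s'])   # optimal cost J_0(s); equals 7 for this toy data (path s -> B -> D -> t)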

The optimal cost is just J_0(s) and is equal to the length of the shortest path from s to t.

Just as in the standard dynamic programming algorithm, the above algorithm proceeds backward in time. However, it is easy to convert it into an algorithm that proceeds forward in time. The crucial observation is that an optimal path from s to t is also an optimal path from t to s if we reverse the directions of all the arcs in the transition graph. Specifically, the algorithm for this reverse problem starts from the set of states S_1 in stage 1, then proceeds to the set of states S_2 in stage 2, and so on, until the states S_N in stage N are reached. Formally, writing J̃_k for the value function of the reverse problem, the forward algorithm for the deterministic finite state system is as follows:

    J̃_1(i) = a^0_{si}                                          for all i ∈ S_1,
    J̃_k(i) = min_{j ∈ S_{k-1}} [ a^{k-1}_{ji} + J̃_{k-1}(j) ]   for all i ∈ S_k, k = 2, ..., N.    (2)

The optimal cost is then given by

    J̃_{N+1}(t) = min_{i ∈ S_N} [ a^N_{it} + J̃_N(i) ].

Note that since the forward optimal path should coincide with the backward optimal path, we must have J_0(s) = J̃_{N+1}(t). To further understand the forward algorithm, recall that J_k(i) is the optimal cost to go from state i ∈ S_k to state t. Correspondingly, we may interpret J̃_k(i) as the optimal cost to arrive at state i ∈ S_k from state s.

One of the advantages of the forward algorithm is that it does not require knowledge of the problem data in time periods k+1, ..., N when making the decision for time period k. Later, we will see how this comes into play in applications.
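For comparison, here is a minimal sketch of the forward recursion (2), using the same data format as the backward sketch above. Run on that toy data, it returns the same optimal cost as the backward pass, illustrating J_0(s) = J̃_{N+1}(t).

import math

def forward_dp(states, cost, terminal_cost, s):
    """Forward DP, cf. recursion (2): Jt[k][i] = optimal cost of arriving at i in S_k from s."""
    N = len(states) - 1
    Jt = {1: {i: cost[0].get((s, i), math.inf) for i in states[1]}}   # J~_1(i) = a^0_{si}
    for k in range(2, N + 1):
        # J~_k(i) = min_{j in S_{k-1}} [ a^{k-1}_{ji} + J~_{k-1}(j) ]
        Jt[k] = {i: min(cost[k - 1].get((j, i), math.inf) + Jt[k - 1][j]
                        for j in states[k - 1])
                 for i in states[k]}
    # J~_{N+1}(t) = min_{i in S_N} [ a^N_{it} + J~_N(i) ]
    return min(terminal_cost[i] + Jt[N][i] for i in states[N])

# On the toy data of the previous sketch, forward_dp(states, cost, terminal, 's')
# returns 7, the same value as J[0]['s'] from the backward pass.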

2.2 Formulating a Shortest Path Problem as a Deterministic Finite State Problem

In the last subsection, we saw that a deterministic finite state problem can be formulated as a special type of shortest path problem, one in which the graph has no cycles. As it turns out, a general shortest path problem can also be formulated as a deterministic finite state problem. Consequently, one can apply the dynamic programming algorithm to solve the shortest path problem.

To prove this result, let us introduce some preliminaries. Let V = {1, 2, ..., N, t} be the set of nodes of a graph, and let a_{ij} be the cost of moving from node i to node j. We assume that a_{ij} = a_{ji}, i.e., the cost of moving between nodes i and j does not depend on the direction. Moreover, we set a_{ij} = ∞ if one cannot move between nodes i and j directly. The node t is designated as the destination, and the goal is to find a shortest path from each node i to the node t.

In order for the problem to be well defined, we need to assume that there are no negative cycles in the graph, i.e., there does not exist a sequence of nodes j_1, ..., j_k such that

    a_{j_1 j_2} + a_{j_2 j_3} + ... + a_{j_{k-1} j_k} + a_{j_k j_1} < 0.

Under this assumption, all cycles have non-negative cost, and it is clear that a shortest path need not take more than N moves. This motivates us to formulate the shortest path problem as an N-stage dynamic programming problem, where each stage corresponds to a move in the graph, and we allow degenerate moves of the form i → i with associated cost a_{ii} = 0. Now, let

    J_k(i) = optimal cost of getting from i to t in N - k moves.

Then, the optimal cost of a path from i to t is J_0(i). To apply the dynamic programming algorithm to this problem, we simply observe that

    J_k(i) = min_{j ∈ {1, ..., N}} [ a_{ij} + J_{k+1}(j) ]   for i = 1, ..., N, k = 0, 1, ..., N-2,
    J_{N-1}(i) = a_{it}                                      for i = 1, ..., N.    (3)

As an example, consider the shortest path problem shown in Figure 2. Here, we have N = 4 and node 5 is the destination node t.

Figure 2: A Shortest Path Problem with N = 4 and t = 5

From the dynamic programming equations (3), we compute

    J_3(1) = 2,   J_3(2) = 7,     J_3(3) = 5,   J_3(4) = 3,
    J_2(1) = 2,   J_2(2) = 5.5,   J_2(3) = 4,   J_2(4) = 3,
    J_1(1) = 2,   J_1(2) = 4.5,   J_1(3) = 4,   J_1(4) = 3,
    J_0(1) = 2,   J_0(2) = 4.5,   J_0(3) = 4,   J_0(4) = 3.

The optimal path from, e.g., node 3 to node 5 can then be read off by tracing the above computation:

    J_0(3) = a_{33} + J_1(3)    (next node on path: 3)
           = a_{33} + J_2(3)    (next node on path: 3)
           = a_{34} + J_3(4)    (next node on path: 4)
           = 1 + 3              (next node on path: 5),

i.e., the optimal path is 3 → 4 → 5 (after discarding the degenerate moves 3 → 3).
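The recursion (3) translates directly into a short program. The arc costs below are hypothetical (the costs of Figure 2 are given only in the figure itself), but the structure is the one described above: N ordinary nodes plus a destination t, symmetric costs, and degenerate moves of cost a_{ii} = 0.

import math

def shortest_paths_to_t(a, N):
    """DP form (3) of the all-nodes-to-t shortest path problem.

    a[i][j] : symmetric arc costs for nodes 0, ..., N, where node N plays
              the role of the destination t; a[i][i] = 0 (degenerate move)
              and a[i][j] = math.inf where no direct arc exists.
    Returns J with J[k][i] for k = 0, ..., N-1 and i = 0, ..., N-1.
    """
    J = [[math.inf] * N for _ in range(N)]
    for i in range(N):
        J[N - 1][i] = a[i][N]                 # J_{N-1}(i) = a_{it}
    for k in range(N - 2, -1, -1):
        for i in range(N):
            # J_k(i) = min_{j} [ a_{ij} + J_{k+1}(j) ]
            J[k][i] = min(a[i][j] + J[k + 1][j] for j in range(N))
    return J

# Hypothetical 4-node instance (nodes 0..3, destination t = node 4).
inf = math.inf
a = [[0,   2,   inf, inf, 6],
     [2,   0,   3,   inf, 7],
     [inf, 3,   0,   1,   5],
     [inf, inf, 1,   0,   3],
     [6,   7,   5,   3,   0]]
J = shortest_paths_to_t(a, 4)
print(J[0])   # optimal cost from each node to t; [6, 7, 4, 3] for this toy data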

3 The Critical Path Analysis

Consider the problem of scheduling a large project. There are many tasks to complete before the entire project is finished, each task has a duration, and different tasks may have complicated precedence relationships. We may construct a graph in which each arc represents a task, with the duration of the task being the weight on the arc. In addition, there is a node s symbolizing the start of the project and a node t symbolizing the finish of the project. Suppose that (i, j) is a task (arc) and that the duration of the task is t_{ij}. See Figure 3 for an illustration.

Figure 3: Critical Path Analysis

The longest path from s to t is called a critical path, and all tasks on critical paths are called critical tasks. The length of a critical path is the minimum total completion time of the project; let us denote it by C. The longest path from s to i, which we denote by E_i, is the earliest time at which the task (i, j) can be started without affecting the completion time of the entire project. Similarly, the longest path from j to t, which we denote by L_j, is the amount of time the project still requires after the task (i, j) is finished; hence C - L_j is the latest time by which the task (i, j) must be finished without affecting the completion time of the entire project. If C = E_i + L_j + t_{ij}, then the task (i, j) is critical. In general, C - (E_i + L_j + t_{ij}) is the slack time of the task (i, j). It is clear that computing E_i can be done by forward dynamic programming, and computing L_j can be done by backward dynamic programming.
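Both passes can be written as a longest-path dynamic program over the acyclic project graph. The sketch below computes E_i and L_j by memoized forward and backward passes and reports the project duration C together with the slack of every task; the activity list at the bottom is a hypothetical example, not one taken from the handout.

from collections import defaultdict

def critical_path(tasks, s, t):
    """Longest-path DP on an acyclic activity-on-arc project graph.

    tasks : dict mapping each arc (i, j) to its duration t_ij.
    Computes E_i (longest path from s to i) by forward DP and L_j (longest
    path from j to t) by backward DP, and returns the project duration
    C = E_t together with the slack C - (E_i + L_j + t_ij) of every task.
    """
    pred, succ = defaultdict(list), defaultdict(list)
    for (i, j), d in tasks.items():
        pred[j].append((i, d))
        succ[i].append((j, d))
    E, L = {s: 0}, {t: 0}

    def earliest(v):          # E_v, memoized forward pass
        if v not in E:
            E[v] = max(earliest(u) + d for u, d in pred[v])
        return E[v]

    def latest(v):            # L_v, memoized backward pass
        if v not in L:
            L[v] = max(latest(w) + d for w, d in succ[v])
        return L[v]

    C = earliest(t)
    slack = {(i, j): C - (earliest(i) + d + latest(j)) for (i, j), d in tasks.items()}
    return C, slack

# Hypothetical project: five tasks with the given durations.
tasks = {('s', 'a'): 3, ('s', 'b'): 2, ('a', 'c'): 4, ('b', 'c'): 1, ('c', 't'): 2}
C, slack = critical_path(tasks, 's', 't')
print(C)                                              # minimum completion time: 9
print([arc for arc, sl in slack.items() if sl == 0])  # critical tasks: s-a, a-c, c-t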

4 Hidden Markov Models

In many applications, the actual state may not be exactly observable. Instead, one may receive some signals suggesting the likelihood of the possible states. Let X_N = {x_0, x_1, ..., x_N} be the sequence of true states undergone by the system. As each transition takes place, a signal is transmitted. Suppose that the observed signals are Z_N = {z_1, ..., z_N}. The question is: how can we determine the true states X_N from the observed signals Z_N? Suppose that the probability of the transition from x_{k-1} to x_k is p_{x_{k-1}, x_k}, that the probability of observing the signal z_k given this transition is r(z_k; x_{k-1}, x_k), and that the probability of the initial state is P(x_0) = π_{x_0}. So which X_N has the maximum likelihood given Z_N? Clearly,

    P(X_N | Z_N) = P(X_N, Z_N) / P(Z_N).

Since Z_N is fixed, to maximize this likelihood we need to find the X_N maximizing P(X_N, Z_N). In fact, we can establish that

    P(X_N, Z_N) = π_{x_0} ∏_{k=1}^{N} p_{x_{k-1}, x_k} r(z_k; x_{k-1}, x_k).

Now

    ln P(X_N, Z_N) = ln π_{x_0} + ∑_{k=1}^{N} ( ln p_{x_{k-1}, x_k} + ln r(z_k; x_{k-1}, x_k) ).

The most likely sequence of hidden states can therefore be found by forward dynamic programming (to find a longest path, where the arc from x_{k-1} to x_k at stage k has length ln p_{x_{k-1}, x_k} + ln r(z_k; x_{k-1}, x_k)). This approach is known as the Viterbi algorithm, proposed by Andrew Viterbi in 1967. See Figure 4 for an illustration.

Figure 4: Hidden Markov Models
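A compact sketch of the Viterbi recursion implied by the log-likelihood decomposition above: it maximizes ln π_{x_0} + Σ_k [ ln p_{x_{k-1}, x_k} + ln r(z_k; x_{k-1}, x_k) ] by a forward pass followed by a backward trace of the recorded choices. The two-state example at the bottom (states, transition matrix p, observation model r, prior π) is entirely hypothetical.

import math

def viterbi(signals, states, pi, p, r):
    """Most likely hidden state sequence x_0, ..., x_N given signals z_1, ..., z_N.

    pi[x]     : P(x_0 = x)
    p[x][y]   : probability of a transition from state x to state y
    r(z, x, y): probability of observing signal z on a transition x -> y
    Maximizes ln pi[x_0] + sum_k ( ln p[x_{k-1}][x_k] + ln r(z_k, x_{k-1}, x_k) )
    by forward dynamic programming, then recovers the path from back-pointers.
    """
    score = {x: math.log(pi[x]) for x in states}   # best log-likelihood of a path ending at x
    back = []
    for z in signals:
        stage = {}
        for y in states:
            # extend the best partial path into state y
            stage[y] = max((score[x] + math.log(p[x][y]) + math.log(r(z, x, y)), x)
                           for x in states)
        score = {y: stage[y][0] for y in states}
        back.append({y: stage[y][1] for y in states})
    x = max(states, key=lambda y: score[y])        # last state of the most likely path
    path = [x]
    for choice in reversed(back):
        x = choice[x]
        path.append(x)
    return list(reversed(path))

# Hypothetical two-state example with two possible signals.
states = ['rain', 'dry']
pi = {'rain': 0.5, 'dry': 0.5}
p = {'rain': {'rain': 0.7, 'dry': 0.3}, 'dry': {'rain': 0.4, 'dry': 0.6}}
r = lambda z, x, y: 0.8 if (z == 'wet') == (y == 'rain') else 0.2
print(viterbi(['wet', 'wet', 'none'], states, pi, p, r))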